Two-way source coding with a helper

Haim Permuter, Yossef Steinberg and Tsachy Weissman 

Abstract 

Consider the two-way rate-distortion problem in which a helper sends a common limited-rate message to both
users based on side information at its disposal. We characterize the region of achievable rates and distortions where a
Markov form (Helper)-(User 1)-(User 2) holds. The main insight of the result is that in order to achieve the optimal
rate, the helper may use a binning scheme, as in Wyner-Ziv, where the side information at the decoder is the "further"
user, namely, User 2. We derive these regions explicitly for the Gaussian sources with square error distortion, analyze
a trade-off between the rate from the helper and the rate from the source, and examine a special case where the
helper has the freedom to send different messages, at different rates, to the encoder and the decoder. The converse
proofs use a new technique for verifying Markov relations via undirected graphs.

Index Terms 

Rate-distortion, two-way rate distortion, undirected graphs, verification of Markov relations, Wyner-Ziv source 
coding. 

I. Introduction 

In this paper, we consider the problem of two-way source encoding with a fidelity criterion in a situation where
both users receive a common message from a helper. The problem is presented in Fig. 1. Note that the case in
which the helper is absent was introduced and solved by Kaspi [1].

Fig. 1. The two-way rate distortion problem with a helper. First Helper Y sends a common message to User X and to User Z, then User Z sends
a message to User X, and finally User X sends a message to User Z. The goal is that User X will reconstruct the sequence $Z^n$ within a fidelity
criterion $\mathbb{E}\left[\frac{1}{n}\sum_{i=1}^{n} d_z(Z_i,\hat{Z}_i)\right] \le D_z$, and User Z will reconstruct the source $X^n$ within a fidelity criterion $\mathbb{E}\left[\frac{1}{n}\sum_{i=1}^{n} d_x(X_i,\hat{X}_i)\right] \le D_x$.
We assume that the side information Y and the two sources X, Z are i.i.d. and form the Markov chain $Y - X - Z$.

The encoding and decoding are done in blocks of length $n$. The communication protocol is that Helper Y first
sends a common message at rate $R_1$ to User X and to User Z, then User Z sends a message at rate $R_2$ to User
X, and finally, User X sends a message to User Z at rate $R_3$. Note that User Z sends its message after receiving
only one message, while User X sends its message after receiving two messages. We assume that the sources
and the helper sequences are i.i.d. and form the Markov chain $Y - X - Z$. User X receives two messages (one
from the helper and one from User Z) and reconstructs the source $Z^n$. We assume that the fidelity (or distortion)
is of the form $\mathbb{E}\left[\frac{1}{n}\sum_{i=1}^{n} d_z(Z_i,\hat{Z}_i)\right]$ and that this term should be less than a threshold $D_z$. User Z also receives
two messages (one from the helper and one from User X) and reconstructs the source $X^n$. The reconstruction $\hat{X}^n$
must lie within a fidelity criterion of the form $\mathbb{E}\left[\frac{1}{n}\sum_{i=1}^{n} d_x(X_i,\hat{X}_i)\right] \le D_x$.

Our main result in this paper is that the achievable region for this problem is given by $\mathcal{R}(D_x, D_z)$, which is
defined as the set of all rate triples $(R_1, R_2, R_3)$ that satisfy

$$R_1 \ge I(Y;U|Z), \tag{1}$$

$$R_2 \ge I(Z;V|U,X), \tag{2}$$

$$R_3 \ge I(X;W|U,V,Z), \tag{3}$$

for some joint distribution of the form

$$p(x,y,z,u,v,w) = p(x,y)\,p(z|x)\,p(u|y)\,p(v|u,z)\,p(w|u,v,x), \tag{4}$$

where $U$, $V$ and $W$ are auxiliary random variables with bounded cardinality. The reconstruction variable $\hat{Z}$ is a
deterministic function of the triple $(U, V, X)$, and the reconstruction $\hat{X}$ is a deterministic function of the triple
$(U, W, Z)$ such that

$$\mathbb{E}\, d_x(X,\hat{X}(U,W,Z)) \le D_x, \qquad \mathbb{E}\, d_z(Z,\hat{Z}(U,V,X)) \le D_z. \tag{5}$$

The main insight gained from this region is that the helper may use a code based on binning that is designed for a
decoder with side information, as in Wyner and Ziv [2]. User X and User Z do not have the same side information,
but it is sufficient to design the helper's code assuming that the side information at the decoder is the one that is
"further" in the Markov chain, namely, $Z$. Since a distribution of the form (4) implies that $I(U;Z) \le I(U;X)$, a
Wyner-Ziv code at rate $R_1 \ge I(Y;U|Z)$ would be decoded successfully both by User Z and by User X. Once the
helper's message has been decoded by both users, a two-way source coding is performed where both users have
additional side information $U^n$.
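The rate comparison behind this insight can be spelled out in a few lines. The following is a sketch that uses only the Markov chain $U - Y - X - Z$ implied by (4) and the data-processing inequality; it is not taken from the paper's own proofs.

```latex
% Since U - Y - Z and U - Y - X hold, I(U;Z|Y) = I(U;X|Y) = 0, hence
\begin{align*}
I(Y;U|Z) &= I(Y,Z;U) - I(U;Z) = I(Y;U) - I(U;Z),\\
I(Y;U|X) &= I(Y,X;U) - I(U;X) = I(Y;U) - I(U;X).
\end{align*}
% Data processing along U - X - Z gives I(U;Z) \le I(U;X), so
\[
I(Y;U|Z) \;\ge\; I(Y;U|X).
\]
```

In words, a helper rate that suffices for the user with the weaker side information $Z$ also suffices for the user holding $X$.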

Several papers on related problems have appeared in the literature. Wyner [3] studied a problem of
network source coding with compressed side information that is provided only to the decoders. A special case of
his model is the system in Fig. 1 but without the memoryless side information Z and where the stream carrying
the helper's message arrives only at the decoder (User Z). A full characterization of the achievable region can be
concluded from the results of [3] for the special case where the source X has to be reconstructed losslessly. This
problem was solved independently by Ahlswede and Korner in [4], but the extension of these results to the case
of lossy reconstruction of X remains open. Kaspi [5] and Kaspi and Berger [6] derived an achievable region for
a problem that contains the helper problem with degenerate Z as a special case. However, the converse part does
not match. In [7], Vasudevan and Perron described a general rate distortion problem with encoder breakdown and
solved the case where one of the sources in Fig. 1 is a constant¹.

Berger and Yeung [9] solved the multi-terminal source coding problem where one of the two sources needs
to be reconstructed perfectly and the other source needs to be reconstructed with a fidelity criterion. Oohama
solved the multi-terminal source coding case for two [10] and $L + 1$ [11] Gaussian sources, in which only one
source needs to be reconstructed with a mean square error, that is, the other $L$ sources are helpers. More recently,
Wagner, Tavildar, and Viswanath characterized the region where both sources [12] or $L + 1$ sources [13] need to
be reconstructed at the decoder with a mean square error criterion.

In [1], Kaspi introduced a multistage communication between two users, where each user may transmit up
to K messages to the other user that depend on the source and previously received messages. In this paper we
also consider the multi-stage source coding with a common helper. The case where a helper is absent and the
communication between the users is via memoryless channels was recently solved by Maor and Merhav [14], where
they showed that a source channel separation theorem holds.

The remainder of the paper is organized as follows. In Section II we present a new technique for verifying
Markov relations between random variables based on undirected graphs. The technique is used throughout the
converse proofs. The problem definition and the achievable region for the two-way rate distortion problem with a
common helper are presented in Section III. Then we consider two special cases: first, in Section IV, we consider
the case of $R_2 = 0$ and $D_z = \infty$, and in Section V we consider $R_3 = 0$ and $D_x = \infty$. The proofs of these two
special cases provide the insight and the tricks that are used in the proof of the general two-way rate distortion
problem with a helper. The proof of the achievable region for the two-way rate distortion problem with a helper
is given in Section VI and it is extended to a multi-stage two-way rate distortion with a helper in Section VII. In
Section VIII we consider the Gaussian instance of the problem and derive the region explicitly. In Section IX we
return to the special case where $R_2 = 0$ and $D_z = \infty$ and analyze the trade-off between the bits from the helper
and bits from the source and gain insight for the case where the helper sends different messages to each user, which
is an open problem.

II. Preliminary: A technique for checking Markov relations 

Here we present a new technique, based on undirected graphs, that provides a sufficient condition for establishing 
a Markov chain from a joint distribution. We use this technique throughout the paper to verify Markov relations. 
A different technique using directed graphs was introduced by Pearl [15, Ch 1.2], [16]. 

¹ The case where one of the sources is constant was also considered independently in [8].

Assume we have a set of random variables $(X_1, X_2, \ldots, X_N)$, where $N$ is the size of the set. Without loss of
generality, we assume that the joint distribution has the form

$$p(x^N) = f(x_{S_1})\, f(x_{S_2}) \cdots f(x_{S_K}), \tag{6}$$

where $X_{S_i} = \{X_j\}_{j \in S_i}$ and $S_i$ is a subset of $\{1, 2, \ldots, N\}$. The following graphical technique provides a
sufficient condition for the Markov relation $X_{G_1} - X_{G_2} - X_{G_3}$, where $X_{G_i}$, $i = 1, 2, 3$, denote three disjoint subsets
of $X^N$.

The technique comprises two steps:

1) draw an undirected graph where all the random variables $X^N$ are nodes in the graph and, for all $i = 1, 2, \ldots, K$,
draw edges between all the nodes $X_{S_i}$,

2) if all paths in the graph from a node in $X_{G_1}$ to a node in $X_{G_3}$ pass through a node in $X_{G_2}$, then the Markov
chain $X_{G_1} - X_{G_2} - X_{G_3}$ holds.

Fig. 2. The undirected graph that corresponds to the joint distribution given in (7). The Markov form $X_1 - X_2 - Z_2$ holds since all paths
from $X_1$ to $Z_2$ pass through $X_2$. The node with the open circle, i.e., o, is the middle term in the Markov chain and all the other nodes are
with solid circles, i.e., •.

Example 1: Consider the joint distribution

$$p(x^2, y^2, z^2) = p(x_1, y_2)\, p(y_1, x_2)\, p(z_1|x_1, x_2)\, p(z_2|y_1). \tag{7}$$

Fig. 2 illustrates the above technique for verifying the Markov relation $X_1 - X_2 - Z_2$. We conclude that since all
the paths from $X_1$ to $Z_2$ pass through $X_2$, the Markov chain $X_1 - X_2 - Z_2$ holds.
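The two-step technique and Example 1 can be sketched in code. The sketch below is illustrative only: the function names `build_graph` and `separated` are hypothetical, each factor of (6) is given as a tuple of variable names, and the three sets are assumed disjoint. A return value of `True` means the sufficient condition is met; `False` only means the test is inconclusive.

```python
from collections import deque
from itertools import combinations


def build_graph(factors):
    """Step 1: connect every pair of variables that appear in a common factor."""
    adj = {}
    for scope in factors:
        for v in scope:
            adj.setdefault(v, set())
        for a, b in combinations(scope, 2):
            adj[a].add(b)
            adj[b].add(a)
    return adj


def separated(factors, g1, g2, g3):
    """Step 2: True if every path from g1 to g3 passes through a node of g2.

    Assumes g1, g2, g3 are disjoint. Breadth-first search from g1 that never
    enters g2; reaching g3 means an unblocked path exists.
    """
    adj = build_graph(factors)
    blocked, targets = set(g2), set(g3)
    seen = set(g1)
    queue = deque(g1)
    while queue:
        node = queue.popleft()
        for nxt in adj.get(node, ()):
            if nxt in blocked or nxt in seen:
                continue
            if nxt in targets:
                return False
            seen.add(nxt)
            queue.append(nxt)
    return True


# Factor scopes of (7): p(x1,y2) p(y1,x2) p(z1|x1,x2) p(z2|y1)
example1 = [("X1", "Y2"), ("Y1", "X2"), ("Z1", "X1", "X2"), ("Z2", "Y1")]

print(separated(example1, ["X1"], ["X2"], ["Z2"]))  # True: X1 - X2 - Z2 is verified
print(separated(example1, ["X1"], ["Y1"], ["Z1"]))  # False: X1 and Z1 share a factor
```

A breadth-first search is used because the condition only asks whether any unblocked path exists, not for the paths themselves.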

The proof of the technique is based on the observation that if three random variables $X, Y, Z$ have a joint distribution
of the form $p(x, y, z) = f(x, y) f(y, z)$, then the Markov chain $X - Y - Z$ holds. The proof appears in Appendix A.
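For orientation, the observation itself follows from a direct computation. The lines below are a sketch for discrete variables and for values $y$ with $p(y) > 0$; they are not a substitute for the proof in the appendix.

```latex
\begin{align*}
p(y) &= \sum_{x',z'} f(x',y)\, f(y,z')
      = \Big(\sum_{x'} f(x',y)\Big)\Big(\sum_{z'} f(y,z')\Big),\\
p(x,z|y) &= \frac{f(x,y)\, f(y,z)}{p(y)}
      = \frac{f(x,y)}{\sum_{x'} f(x',y)} \cdot \frac{f(y,z)}{\sum_{z'} f(y,z')}
      = p(x|y)\, p(z|y).
\end{align*}
```

The last equality holds because summing $p(x,z|y)$ over $z$ (respectively $x$) leaves exactly the first (respectively second) fraction.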

III. Problem definitions and main results 

Here we formally define the two-way rate-distortion problem with a helper and present a single-letter
characterization of the achievable region. We use the regular definitions of rate distortion and we follow the notation of
[17]. The source sequences $\{X_i \in \mathcal{X}, i = 1, 2, \ldots\}$, $\{Z_i \in \mathcal{Z}, i = 1, 2, \ldots\}$ and the side information sequence
$\{Y_i \in \mathcal{Y}, i = 1, 2, \ldots\}$ are discrete random variables drawn from finite alphabets $\mathcal{X}$, $\mathcal{Z}$ and $\mathcal{Y}$, respectively.
The random variables $(X_i, Y_i, Z_i)$ are i.i.d. $\sim p(x, y, z)$. Let $\hat{\mathcal{X}}$ and $\hat{\mathcal{Z}}$ be the reconstruction alphabets, and
$d_x : \mathcal{X} \times \hat{\mathcal{X}} \to [0, \infty)$, $d_z : \mathcal{Z} \times \hat{\mathcal{Z}} \to [0, \infty)$ be single letter distortion measures. Distortion between sequences
is defined in the usual way

$$d(x^n, \hat{x}^n) = \frac{1}{n}\sum_{i=1}^{n} d_x(x_i, \hat{x}_i), \qquad d(z^n, \hat{z}^n) = \frac{1}{n}\sum_{i=1}^{n} d_z(z_i, \hat{z}_i). \tag{8}$$

Let $\mathcal{M}_i$ denote a set of positive integers $\{1, 2, \ldots, M_i\}$ for $i = 1, 2, 3$.
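As a small illustration of the sequence distortion in (8), the sketch below averages a single-letter distortion over a block. The helper name `avg_distortion` and the two example measures (Hamming and squared error) are assumptions chosen for illustration; the definition above allows any non-negative single-letter measure.

```python
def avg_distortion(seq, seq_hat, d):
    """Per-letter average distortion (1/n) * sum_i d(seq[i], seq_hat[i])."""
    if len(seq) != len(seq_hat) or len(seq) == 0:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum(d(a, b) for a, b in zip(seq, seq_hat)) / len(seq)


def hamming(a, b):
    """Hamming distortion: 0 if the letters agree, 1 otherwise."""
    return 0 if a == b else 1


def squared_error(a, b):
    """Squared-error distortion for real-valued letters."""
    return (a - b) ** 2


print(avg_distortion([0, 1, 1, 0], [0, 1, 0, 0], hamming))   # 0.25
print(avg_distortion([1.0, 2.0], [1.5, 2.0], squared_error))  # 0.125
```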

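Definition 1: An $(n, M_1, M_2, M_3, D_x, D_z)$ code for two sources X and Z with helper Y consists of three encoders

$$\begin{aligned}
f_1 &: \mathcal{Y}^n \to \mathcal{M}_1,\\
f_2 &: \mathcal{Z}^n \times \mathcal{M}_1 \to \mathcal{M}_2,\\
f_3 &: \mathcal{X}^n \times \mathcal{M}_1 \times \mathcal{M}_2 \to \mathcal{M}_3,
\end{aligned} \tag{9}$$

and two decoders

$$\begin{aligned}
g_2 &: \mathcal{X}^n \times \mathcal{M}_1 \times \mathcal{M}_2 \to \hat{\mathcal{Z}}^n,\\
g_3 &: \mathcal{Z}^n \times \mathcal{M}_1 \times \mathcal{M}_3 \to \hat{\mathcal{X}}^n,
\end{aligned} \tag{10}$$

such that

$$\mathbb{E}\left[\frac{1}{n}\sum_{i=1}^{n} d_x(X_i, \hat{X}_i)\right] \le D_x, \qquad \mathbb{E}\left[\frac{1}{n}\sum_{i=1}^{n} d_z(Z_i, \hat{Z}_i)\right] \le D_z. \tag{11}$$

The rate triple $(R_1, R_2, R_3)$ of the $(n, M_1, M_2, M_3, D_x, D_z)$ code is defined by

$$R_i = \frac{1}{n}\log M_i; \qquad i = 1, 2, 3. \tag{12}$$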

Definition 2: Given a distortion pair $(D_x, D_z)$, a rate triple $(R_1, R_2, R_3)$ is said to be achievable if, for any
$\epsilon > 0$ and sufficiently large $n$, there exists an $(n, 2^{nR_1}, 2^{nR_2}, 2^{nR_3}, D_x + \epsilon, D_z + \epsilon)$ code for the sources X, Z
with side information Y.

Definition 3: The (operational) achievable region $\mathcal{R}^O(D_x, D_z)$ of rate distortion with a helper known at the
encoder and decoder is the closure of the set of all achievable rate triples.

The next theorem is the main result of this work.

Theorem 1: In the two-way rate-distortion problem with a helper, as depicted in Fig. 1, where $Y - X - Z$,

$$\mathcal{R}^O(D_x, D_z) = \mathcal{R}(D_x, D_z), \tag{13}$$

where the region $\mathcal{R}(D_x, D_z)$ is specified in (1)-(5).

Furthermore, the region $\mathcal{R}(D_x, D_z)$ satisfies the following properties, which are proved in Appendix B.

Lemma 2: 1) The region $\mathcal{R}(D_x, D_z)$ is convex.

2) To exhaust $\mathcal{R}(D_x, D_z)$, it is enough to restrict the alphabets of $U$, $V$, and $W$ to satisfy

$$\begin{aligned}
|\mathcal{U}| &\le |\mathcal{Y}| + 4,\\
|\mathcal{V}| &\le |\mathcal{Z}||\mathcal{U}| + 3,\\
|\mathcal{W}| &\le |\mathcal{U}||\mathcal{V}||\mathcal{X}| + 1.
\end{aligned} \tag{14}$$

Before proving the main result (Theorem 1), we would like to consider two special cases, first where $R_2 = 0$
and $D_z = \infty$ and second where $R_3 = 0$ and $D_x = \infty$. The main techniques and insight are gained through those
special cases. Both cases are depicted in Fig. 3, where in the first case we assume the Markov form $Y - X - Z$
and in the second case we assume a Markov form $Y - Z - X$.

The proofs of these two cases are quite different. In the achievability of the first case, we use a Wyner-Ziv code
that is designed only for the decoder, and in the achievability of the second case we use a Wyner-Ziv code that
is designed only for the encoder. In the converse for the first case, the main idea is to observe that the achievable
region does not increase by letting the encoder know Y, and in the converse of the second case the main idea is
to use the chain rule in two opposite directions, conditioning once on the past and once on the future.

Fig. 3. Wyner-Ziv problem with a helper. We consider two cases: in the first, the source X, Helper Y and the side information Z form the Markov
chain $Y - X - Z$, and in the second case they form the Markov chain $Y - Z - X$.



IV. Wyner-Ziv with a helper where Y-X-Z 

In this section, we consider the rate distortion problem with a helper and additional side information Z, known
only to the decoder, as shown in Fig. 3. We also assume that the source X, the helper Y, and the side information
Z form the Markov chain $Y - X - Z$. This setting corresponds to the case where $R_2 = 0$ and $D_z = \infty$. Let us
denote by $\mathcal{R}^O_{Y-X-Z}(D)$ the (operational) achievable region $\mathcal{R}^O(D_x = D, D_z = \infty)$.

We now present our main result of this section. Let $\mathcal{R}_{Y-X-Z}(D)$ be the set of all rate pairs $(R, R_1)$ that satisfy

$$R_1 \ge I(U;Y|Z), \tag{15}$$

$$R \ge I(X;W|U,Z), \tag{16}$$

for some joint distribution of the form

$$p(x,y,z,u,w) = p(x,y)\,p(z|x)\,p(u|y)\,p(w|x,u), \tag{17}$$

$$\mathbb{E}\, d_x(X,\hat{X}(U,W,Z)) \le D, \tag{18}$$

where $U$ and $W$ are auxiliary random variables, and the reconstruction variable $\hat{X}$ is a deterministic function of
the triple $(U, W, Z)$. The next lemma states properties of $\mathcal{R}_{Y-X-Z}(D)$. It is the analog of Lemma 2 and the proof
is omitted.

Lemma 3: 1) The region $\mathcal{R}_{Y-X-Z}(D)$ is convex.

2) To exhaust $\mathcal{R}_{Y-X-Z}(D)$, it is enough to restrict the alphabets of $U$ and $W$ to satisfy

$$\begin{aligned}
|\mathcal{U}| &\le |\mathcal{Y}| + 2,\\
|\mathcal{W}| &\le |\mathcal{X}|(|\mathcal{Y}| + 2) + 1.
\end{aligned} \tag{19}$$

Theorem 4: The achievable rate region for the setting illustrated in Fig. 3, where $X, Y, Z$ are i.i.d. random
variables forming the Markov chain $Y - X - Z$, is

$$\mathcal{R}^O_{Y-X-Z}(D) = \mathcal{R}_{Y-X-Z}(D). \tag{20}$$

Let us define an additional region $\overline{\mathcal{R}}_{Y-X-Z}(D)$ the same as $\mathcal{R}_{Y-X-Z}(D)$ but with the term $p(w|x,u)$ in (17)
replaced by $p(w|x, u, y)$, i.e.,

$$p(x, y, z, u, w) = p(x, y)\,p(z|x)\,p(u|y)\,p(w|x, u, y). \tag{21}$$

In the proof of Theorem 4 we show that $\mathcal{R}_{Y-X-Z}(D)$ is achievable and that $\overline{\mathcal{R}}_{Y-X-Z}(D)$ is an outer bound,
and we conclude the proof by applying the following lemma, which states that the two regions are equal.

Lemma 5: $\overline{\mathcal{R}}_{Y-X-Z}(D) = \mathcal{R}_{Y-X-Z}(D)$.

Proof: Trivially we have $\overline{\mathcal{R}}_{Y-X-Z}(D) \supseteq \mathcal{R}_{Y-X-Z}(D)$. Now we prove that $\overline{\mathcal{R}}_{Y-X-Z}(D) \subseteq \mathcal{R}_{Y-X-Z}(D)$. Let
$(R, R_1) \in \overline{\mathcal{R}}_{Y-X-Z}(D)$, and let

$$\bar{p}(x, y, z, u, w) = p(x, y)\,p(z|x)\,p(u|y)\,\bar{p}(w|x, u, y) \tag{22}$$

be a distribution that satisfies (15), (16) and (18). Now we show that there exists a distribution of the form (17)
such that (15), (16) and (18) hold.
Let

$$p(x, y, z, u, w) = p(x, y, z)\,p(u|y)\,\bar{p}(w|x, u), \tag{23}$$

where $\bar{p}(w|x,u)$ is induced by $\bar{p}(x,y,z,u,w)$. We now show that the terms $I(U;Y|Z)$, $I(X;W|Z,U)$ and
$\mathbb{E}\, d(X, \hat{X}(U,W,Z))$ are the same whether we evaluate them by the joint distribution $p(x,y,z,u,w)$ of (23) or
by $\bar{p}(x, y, z, u, w)$ of (22); hence $(R, R_1) \in \mathcal{R}_{Y-X-Z}(D)$. In order to show that the terms above are the same it is enough
to show that the marginal distributions $p(y,z,u)$ and $p(x,z,u,w)$ induced by $p(x,y,z,u,w)$ are equal to the
marginal distributions $\bar{p}(y, z, u)$ and $\bar{p}(x, z, u, w)$ induced by $\bar{p}(x, y, z, u, w)$. Clearly $p(y, u, z) = \bar{p}(y, u, z)$. In the
rest of the proof we show $p(x, z, u, w) = \bar{p}(x, z, u, w)$.

A distribution of the form $\bar{p}(x, y, z, u, w)$ as given in (22) implies that the Markov chain $W - (X, U) - Z$ holds,
as shown in Fig. 4. Therefore $\bar{p}(w|x, u, z) = \bar{p}(w|x, u)$. Now consider $\bar{p}(x, z, u, w) = \bar{p}(x, z, u)\,\bar{p}(w|x, u)$, and since
$p(x, z, u) = \bar{p}(x, z, u)$ and $p(w|x, u) = \bar{p}(w|x, u)$ we conclude that $p(x, z, u, w) = \bar{p}(x, z, u, w)$. ■

Fig. 4. A graphical proof of the Markov chain $W - (X, U) - Z$. The undirected graph corresponds to the joint distribution given in (22),
i.e., $\bar{p}(x,y,z,u,w) = p(x,y)\,p(z|x)\,p(u|y)\,\bar{p}(w|u,x,y)$. The Markov chain holds since there is no path from Z to W that does not pass
through $(X, U)$.
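The marginal-matching argument of Lemma 5 can be checked numerically on a small example. The sketch below draws a random distribution of the form (22), builds the distribution of the form (23) from the induced conditional of $W$ given $(X, U)$, and compares the two marginals used in the proof. The alphabet size, the random seed, and the function names (`random_pmf`, `marginal`, `lemma5_gap`) are illustrative assumptions, not part of the paper.

```python
import itertools
import random


def random_pmf(k, rng):
    """Return a strictly positive probability vector of length k."""
    weights = [rng.random() + 0.1 for _ in range(k)]
    total = sum(weights)
    return [w / total for w in weights]


def marginal(pmf, keep):
    """Marginalize a pmf keyed by (x, y, z, u, w) onto the positions in keep."""
    out = {}
    for key, prob in pmf.items():
        sub = tuple(key[i] for i in keep)
        out[sub] = out.get(sub, 0.0) + prob
    return out


def lemma5_gap(seed=0, k=2):
    """Largest difference between the (y,z,u) and (x,z,u,w) marginals of (22) and (23)."""
    rng = random.Random(seed)
    vals = range(k)
    pairs = list(itertools.product(vals, vals))
    p_xy = dict(zip(pairs, random_pmf(k * k, rng)))
    p_z_x = {x: random_pmf(k, rng) for x in vals}
    p_u_y = {y: random_pmf(k, rng) for y in vals}
    p_w_xuy = {key: random_pmf(k, rng) for key in itertools.product(vals, repeat=3)}

    # Distribution of the form (22): W may depend on (x, u, y).
    p_bar = {}
    for x, y, z, u, w in itertools.product(vals, repeat=5):
        p_bar[(x, y, z, u, w)] = (
            p_xy[(x, y)] * p_z_x[x][z] * p_u_y[y][u] * p_w_xuy[(x, u, y)][w]
        )

    # Conditional of W given (x, u) induced by p_bar.
    p_xuw = marginal(p_bar, (0, 3, 4))
    p_xu = marginal(p_bar, (0, 3))

    # Distribution of the form (23): W depends on (x, u) only.
    p = {}
    for x, y, z, u, w in itertools.product(vals, repeat=5):
        p[(x, y, z, u, w)] = (
            p_xy[(x, y)] * p_z_x[x][z] * p_u_y[y][u] * p_xuw[(x, u, w)] / p_xu[(x, u)]
        )

    gap = 0.0
    for keep in [(1, 2, 3), (0, 2, 3, 4)]:  # (y,z,u) and (x,z,u,w)
        a, b = marginal(p, keep), marginal(p_bar, keep)
        gap = max(gap, max(abs(a[s] - b[s]) for s in a))
    return gap


print(lemma5_gap() < 1e-12)
```

The two joint distributions generally differ as functions of all five variables; only the marginals that enter (15), (16) and (18) coincide, which is what the lemma needs.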
Proof of Theorem 4:

Achievability: The proof follows classical arguments, and therefore the technical details will be omitted. We
describe only the coding structure and the associated Markov conditions. Note that the condition (17) in the definition
of $\mathcal{R}_{Y-X-Z}(D)$ implies the Markov chain $U - Y - X - Z$. The helper (encoder of Y) employs Wyner-Ziv coding
with decoder side information Z and external random variable U, as seen from (15). The Markov conditions required
for such coding, $U - Y - Z$, are satisfied, hence the source decoder, at the destination, can recover the codewords
constructed from U. Moreover, since (17) implies $U - Y - X - Z$, the encoder of X can also reconstruct U (this
is the point where the Markov assumption $Y - X - Z$ is needed). Therefore in the coding/decoding scheme of
X, U serves as side information available at both sides. The source (X) encoder now employs Wyner-Ziv coding
for X, with decoder side information Z, coding random variable W, and U available at both sides. The Markov
conditions needed for this scheme are $W - (X, U) - Z$, which again are satisfied by (17). The rate needed for this
coding is $I(X;W|U,Z)$, reflected in the bound on R in (16). Once the two codes (helper and source code) are
decoded, the destination can use all the available random variables, U, W, and the side information Z, to construct
$\hat{X}$.
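The rate expression for the source code can be related to the usual Wyner-Ziv binning picture. The identity below is a sketch that uses only the Markov chain $W - (X, U) - Z$ and the chain rule for mutual information.

```latex
% Expand I(W; X, Z | U) in two ways:
\begin{align*}
I(W;X,Z|U) &= I(W;X|U) + I(W;Z|X,U) = I(W;X|U),\\
I(W;X,Z|U) &= I(W;Z|U) + I(W;X|U,Z).
\end{align*}
% Equating the two expansions:
\[
I(X;W|U,Z) = I(X;W|U) - I(W;Z|U).
\]
```

Read this way, $I(X;W|U)$ plays the role of the covering rate given the common side information $U$, and $I(W;Z|U)$ is the rate saved by binning against the decoder's side information $Z$.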

Converse: Assume that we have an $(n, M_1 = 2^{nR_1}, M_2 = 1, M_3 = 2^{nR}, D_x = D, D_z = \infty)$ code as in
Definition 1. We will show the existence of a triple $(U, W, \hat{X})$ that satisfies (15)-(18). Denote $T_1 = f_1(Y^n) \in
\{1, \ldots, 2^{nR_1}\}$ and $T = f_3(X^n, T_1) \in \{1, \ldots, 2^{nR}\}$. Then,

$$\begin{aligned}
nR_1 &\ge H(T_1)\\
&\ge H(T_1|Z^n)\\
&\ge I(Y^n;T_1|Z^n)\\
&= \sum_{i=1}^{n} H(Y_i|Z_i) - H(Y_i|Y^{i-1},T_1,Z^n)\\
&\stackrel{(a)}{=} \sum_{i=1}^{n} H(Y_i|Z_i) - H(Y_i|Y^{i-1},X^{i-1},T_1,Z^n)\\
&\ge \sum_{i=1}^{n} H(Y_i|Z_i) - H(Y_i|X^{i-1},T_1,Z^n),
\end{aligned} \tag{24}$$

where equality (a) is due to the Markov form $Y_i - (Y^{i-1}, T_1, Z^n) - X^{i-1}$. Furthermore,

$$\begin{aligned}
nR &\ge H(T)\\
&\ge H(T|T_1,Z^n)\\
&\ge I(X^n;T|T_1,Z^n)\\
&= \sum_{i=1}^{n} H(X_i|T_1,Z^n,X^{i-1}) - H(X_i|T,T_1,Z^n,X^{i-1}).
\end{aligned} \tag{25}$$

Now, let $W_i \triangleq T$ and $U_i \triangleq (X^{i-1}, Z^{n \setminus i}, T_1)$, where $Z^{n \setminus i}$ denotes the vector $Z^n$ without the $i$-th element, i.e.,
$(Z^{i-1}, Z_{i+1}^n)$. Then (24) and (25) become

$$\begin{aligned}
R_1 &\ge \frac{1}{n}\sum_{i=1}^{n} I(Y_i;U_i|Z_i),\\
R &\ge \frac{1}{n}\sum_{i=1}^{n} I(X_i;W_i|U_i,Z_i).
\end{aligned} \tag{26}$$

Now we observe that the Markov chain $U_i - Y_i - (X_i, Z_i)$ holds since we have $(X^{i-1}, Z^{n \setminus i}, T_1(Y^n)) - Y_i -
(X_i, Z_i)$. Also the Markov chain $W_i - (U_i, X_i, Y_i) - Z_i$ holds since $T(T_1, X^n) - (X^i, Y_i, T_1(Y^n), Z^{n \setminus i}) - Z_i$. The
reconstruction at time $i$, i.e., $\hat{X}_i$, is a deterministic function of $(Z^n, T, T_1)$, and in particular it is a deterministic
function of $(U_i, W_i, Z_i)$. Finally, let $Q$ be a random variable independent of $X^n, Y^n, Z^n$, and uniformly distributed
over the set $\{1, 2, 3, \ldots, n\}$. Define the random variables $U = (Q, U_Q)$, $W = (Q, W_Q)$, and $\hat{X} = \hat{X}_Q$ ($\hat{X}_Q$ is a
short notation for time sharing over the estimators). The Markov relations $U - Y - (X, Z)$ and $W - (X, U, Y) - Z$,
the inequality $\mathbb{E}\, d(X, \hat{X}) = \frac{1}{n}\sum_{i=1}^{n} \mathbb{E}\, d(X_i, \hat{X}_i) \le D$, the fact that $\hat{X}$ is a deterministic function of $(U, W, Z)$, and
the inequalities $R_1 \ge I(Y;U|Z)$ and $R \ge I(X;W|U,Z)$ (implied by (26)) imply that $(R, R_1) \in \overline{\mathcal{R}}_{Y-X-Z}(D)$,
completing the proof by Lemma 5. ■
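The passage from the averaged bounds to single-letter bounds via the time-sharing variable $Q$ can be made explicit. The following sketch treats the helper bound and writes $Y = Y_Q$, $Z = Z_Q$, $U = (Q, U_Q)$; the source bound is handled the same way.

```latex
\begin{align*}
\frac{1}{n}\sum_{i=1}^{n} I(Y_i;U_i|Z_i)
  &= I(Y_Q;U_Q|Z_Q,Q)\\
  &= I(Y_Q;U_Q,Q|Z_Q) - I(Y_Q;Q|Z_Q)\\
  &= I(Y;U|Z).
\end{align*}
% The last step uses I(Y_Q;Q|Z_Q) = 0: the pairs (Y_i, Z_i) are identically
% distributed, so (Y_Q, Z_Q) is independent of Q.
```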

V. Wyner-Ziv with a helper where Y - Z - X 

Consider the rate-distortion problem with side information and helper as illustrated in Fig. 3, where the
random variables $X, Y, Z$ form the Markov chain $Y - Z - X$. This setting corresponds to the case where $R_3 = 0$
and the roles of X and Z are exchanged. Let us denote by $\mathcal{R}^O_{Y-Z-X}(D)$ the (operational) achievable region.

Let $\mathcal{R}_{Y-Z-X}(D)$ be the set of all rate pairs $(R, R_1)$ that satisfy

$$R_1 \ge I(U;Y|X), \tag{27}$$

$$R \ge I(X;V|U,Z), \tag{28}$$

for some joint distribution of the form

$$p(x,y,z,u,v) = p(z,y)\,p(x|z)\,p(u|y)\,p(v|x,u), \tag{29}$$

$$\mathbb{E}\, d(X,\hat{X}(U,V,Z)) \le D, \tag{30}$$

where $U$ and $V$ are auxiliary random variables, and the reconstruction variable $\hat{X}$ is a deterministic function of
the triple $(U, V, Z)$. The next lemma states properties of $\mathcal{R}_{Y-Z-X}(D)$. It is the analog of Lemma 2 and therefore
its proof is omitted.

Lemma 6: 1) The region $\mathcal{R}_{Y-Z-X}(D)$ is convex.

2) To exhaust $\mathcal{R}_{Y-Z-X}(D)$, it is enough to restrict the alphabets of $V$ and $U$ to satisfy

$$\begin{aligned}
|\mathcal{U}| &\le |\mathcal{Y}| + 2,\\
|\mathcal{V}| &\le |\mathcal{X}|(|\mathcal{Y}| + 2) + 1.
\end{aligned} \tag{31}$$

Theorem 7: The achievable rate region for the setting illustrated in Fig. 3, where $X_i, Y_i, Z_i$ are i.i.d. triplets
distributed according to the random variables $X, Y, Z$ forming the Markov chain $Y - Z - X$, is

$$\mathcal{R}^O_{Y-Z-X}(D) = \mathcal{R}_{Y-Z-X}(D). \tag{32}$$

Proof:

Achievability: The proof follows classical arguments, and therefore the technical details will be omitted. We
describe only the coding structure and the associated Markov conditions. The helper (encoder of Y) employs
Wyner-Ziv coding with decoder side information X and external random variable U, as seen from (27). The
Markov conditions required for such coding, $U - Y - X$, are satisfied, hence the source encoder
can recover the codewords constructed from U. Moreover, since (29) implies $U - Y - Z - X$, the decoder, at the
destination, can also reconstruct U. Therefore in the coding/decoding scheme of X, U serves as side information
available at both sides. The source X encoder now employs Wyner-Ziv coding for X, with decoder side information
Z, coding random variable V, and U available at both sides. The Markov conditions needed for this scheme are
$V - (X, U) - Z$, which again are satisfied by (29). The rate needed for this coding is $I(X;V|U,Z)$, reflected in
the bound on R in (28). Once the two codes (helper and source code) are decoded, the destination can use all the
available random variables, U, V, and the side information Z, to construct $\hat{X}$.

Converse: Assume that we have a code for a source X with helper Y and side information Z at rate $(R_1, R)$.
We will show the existence of a triple $(U, V, \hat{X})$ that satisfies (27)-(30). Denote $T_1(Y^n) \in \{1, \ldots, 2^{nR_1}\}$ and
$T(X^n, T_1) \in \{1, \ldots, 2^{nR}\}$. Then,

$$\begin{aligned}
nR_1 &\ge H(T_1)\\
&\ge H(T_1|X^n)\\
&\ge I(Y^n;T_1|X^n)\\
&= \sum_{i=1}^{n} H(Y_i|X_i) - H(Y_i|Y^{i-1},T_1,X^n)\\
&\stackrel{(a)}{=} \sum_{i=1}^{n} H(Y_i|X_i) - H(Y_i|Y^{i-1},T_1,X_{i+1}^n,X_i)\\
&\stackrel{(b)}{=} \sum_{i=1}^{n} H(Y_i|X_i) - H(Y_i|Y^{i-1},T_1,X_{i+1}^n,X_i,Z^{i-1})\\
&\stackrel{(c)}{\ge} \sum_{i=1}^{n} H(Y_i|X_i) - H(Y_i|T_1,X_{i+1}^n,X_i,Z^{i-1}),
\end{aligned} \tag{33}$$

where (a) and (b) follow from the Markov chain $Y_i - (Y^{i-1}, T_1(Y^n), X_i^n) - (X^{i-1}, Z^{i-1})$ (see Fig. 5 for the
proof), and (c) follows from the fact that conditioning reduces entropy.

Fig. 5. A graphical proof of the Markov chain $Y_i - (Y^{i-1}, T_1(Y^n), X_i^n) - (X^{i-1}, Z^{i-1})$. The undirected graph corresponds to the joint
distribution $p(x^{i-1}, z^{i-1})\,p(y^{i-1}|z^{i-1})\,p(x_i, z_i)\,p(y_i|z_i)\,p(x_{i+1}^n, z_{i+1}^n)\,p(y_{i+1}^n|z_{i+1}^n)\,p(t_1|y^n)$. The Markov chain holds since all paths from
$Y_i$ to $(X^{i-1}, Z^{i-1})$ pass through $(Y^{i-1}, T_1(Y^n), X_i^n)$. The nodes with the open circle, i.e., o, constitute the middle term in the Markov chain,
i.e., $(Y^{i-1}, T_1(Y^n), X_i^n)$, and all the other nodes are with solid circles, i.e., •. The nodes $Y^{i-1}$, $Y_i$, $Y_{i+1}^n$ and $T_1$ are connected due to the
term $p(t_1|y^n)$.

Consider,

$$\begin{aligned}
nR &\ge H(T)\\
&\ge H(T|T_1,Z^n)\\
&\ge I(X^n;T|T_1,Z^n)\\
&= \sum_{i=1}^{n} H(X_i|X_{i+1}^n,T_1,Z^n) - H(X_i|X_{i+1}^n,T_1,Z^n,T)\\
&\stackrel{(a)}{=} \sum_{i=1}^{n} H(X_i|X_{i+1}^n,T_1,Z^{i}) - H(X_i|X_{i+1}^n,T_1,Z^n,T)\\
&\stackrel{(b)}{\ge} \sum_{i=1}^{n} H(X_i|X_{i+1}^n,T_1,Z^{i-1},Z_i) - H(X_i|X_{i+1}^n,T_1,Z^{i-1},Z_i,T),
\end{aligned} \tag{34}$$

where (a) is due to the Markov chain $X_i - (X_{i+1}^n, T_1(Y^n), Z^i) - Z_{i+1}^n$ (this can be seen from Fig. 5 since all paths
from $X_i$ to $Z_{i+1}^n$ go through $Z_i$), and (b) is due to the fact that conditioning reduces entropy. Now let us denote
$U_i \triangleq (T_1(Y^n), X_{i+1}^n, Z^{i-1})$ and $V_i \triangleq T(X^n, T_1)$. The Markov chains $U_i - Y_i - (X_i, Z_i)$ and $V_i - (X_i, U_i) - (Z_i, Y_i)$
hold (see Fig. 6 for the proof of the last Markov relation).

Next, we need to show that there exists a sequence of functions $\hat{X}_i(U_i, V_i, Z_i)$ such that

$$\frac{1}{n}\sum_{i=1}^{n} \mathbb{E}\left[d(X_i, \hat{X}_i(U_i, V_i, Z_i))\right] \le D. \tag{35}$$

Fig. 6. A graphical proof of the Markov chain $X^{i-1} - (Z^{i-1}, T_1(Y^n), X_i^n) - (Z_i, Y_i)$, which implies $V_i - (X_i, U_i) - (Z_i, Y_i)$. The
undirected graph corresponds to the joint distribution $p(x^{i-1}, z^{i-1})\,p(y^{i-1}|z^{i-1})\,p(x_i, z_i)\,p(y_i|z_i)\,p(x_{i+1}^n, z_{i+1}^n)\,p(y_{i+1}^n|z_{i+1}^n)\,p(t_1|y^n)$. The
Markov chain holds since all paths from $X^{i-1}$ to $(Z_i, Y_i)$ pass through $(Z^{i-1}, T_1(Y^n), X_i^n)$.

By assumption we know that there exists a sequence of functions $\hat{X}_i(T, T_1, Z^n)$ such that
$\sum_{i=1}^{n} \mathbb{E}[d(X_i, \hat{X}_i(T, T_1, Z^n))] \le nD$, and trivially this implies that there exists a sequence of functions
$\hat{X}_i(X_{i+1}^n, T, T_1, Z^n)$ such that

$$\sum_{i=1}^{n} \mathbb{E}\left[d(X_i, \hat{X}_i(X_{i+1}^n, T, T_1, Z^n))\right] \le nD. \tag{36}$$

Note that the Markov chain $X_i - (X_{i+1}^n, T_1, Z^i, T) - Z_{i+1}^n$ holds (see Fig. 7 for the proof). Therefore, for an
arbitrary function $f$ of the form $f(X_{i+1}^n, T_1, Z^i, T)$ we have

$$\sum_{i=1}^{n} \mathbb{E}\left[d(X_i, \hat{X}_i(X_{i+1}^n, T, T_1, Z^i, Z_{i+1}^n))\right] \ge \min_{f} \sum_{i=1}^{n} \mathbb{E}\left[d(X_i, \hat{X}_i(X_{i+1}^n, T, T_1, Z^i, f(X_{i+1}^n, T_1, Z^i, T)))\right], \tag{37}$$

and since each summand on the RHS of (37) includes only the random variables $(X_{i+1}^n, T, T_1, Z^i)$ we conclude
that there exists a sequence of functions $\{\hat{X}_i(X_{i+1}^n, T, T_1, Z^i)\}$ for which (35) holds.

Fig. 7. A graphical proof of the Markov chain $X_i - (X_{i+1}^n, T_1, Z^i, T) - Z_{i+1}^n$. The undirected graph corresponds to the joint distribution
$p(x^{i-1}, z^{i-1})\,p(y^{i-1}|z^{i-1})\,p(x_i, z_i)\,p(y_i|z_i)\,p(x_{i+1}^n, z_{i+1}^n)\,p(y_{i+1}^n|z_{i+1}^n)\,p(t_1|y^n)\,p(t|x^n, t_1)$. The Markov chain holds since all paths from
$X_i$ to $Z_{i+1}^n$ pass through $(X_{i+1}^n, T_1, Z^i, T)$.

Finally, let $Q$ be a random variable independent of $X^n, Y^n, Z^n$ and uniformly distributed over the set
$\{1, 2, 3, \ldots, n\}$. Define the random variables $U = (Q, U_Q)$, $V = (Q, V_Q)$, and $\hat{X} = \hat{X}_Q$ ($\hat{X}_Q$ is a short notation
for time sharing over the estimators). Then (33)-(35) imply that (27)-(30) hold. ■

VI. Proof of Theorem 1

In this section we prove Theorem 1, which states that the (operational) achievable region $\mathcal{R}^O(D_x, D_z)$ of the
two-way source coding with helper problem as in Fig. 1 equals $\mathcal{R}(D_x, D_z)$. In the converse proof we use the ideas
used in proving the converses of Theorems 4 and 7. Namely, we will use the chain rule based on the past and
future, and will show that $\mathcal{R}^O(D_x, D_z) \subseteq \overline{\mathcal{R}}(D_x, D_z)$, where $\overline{\mathcal{R}}(D_x, D_z)$ is defined as $\mathcal{R}(D_x, D_z)$ in (1)-(5) but
with one difference: the term $p(w|u, v, x)$ in (4) should be replaced by $p(w|u, v, x, y)$, i.e.,

$$p(x, y, z, u, v, w) = p(x, y)\,p(z|x)\,p(u|y)\,p(v|u, z)\,p(w|u, v, x, y). \tag{38}$$

The following lemma states that the two regions $\mathcal{R}(D_x, D_z)$ and $\overline{\mathcal{R}}(D_x, D_z)$ are equal.

Lemma 8: $\mathcal{R}(D_x, D_z) = \overline{\mathcal{R}}(D_x, D_z)$.

Proof: Trivially we have $\overline{\mathcal{R}}(D_x, D_z) \supseteq \mathcal{R}(D_x, D_z)$. Now we prove that $\overline{\mathcal{R}}(D_x, D_z) \subseteq \mathcal{R}(D_x, D_z)$. Let
$(R_1, R_2, R_3) \in \overline{\mathcal{R}}(D_x, D_z)$, and let

$$\bar{p}(x, y, z, u, v, w) = p(x, y)\,p(z|x)\,p(u|y)\,p(v|u, z)\,\bar{p}(w|u, v, x, y) \tag{39}$$

be a distribution that satisfies (1)-(3) and (5). Next we show that there exists a distribution of the form of (4) (which
is explicitly given in (40)) such that (1)-(3) and (5) hold. Let

$$p(x, y, z, u, v, w) = p(x, y)\,p(z|x)\,p(u|y)\,p(v|u, z)\,\bar{p}(w|u, v, x), \tag{40}$$

where $\bar{p}(w|u,v,x)$ is induced by $\bar{p}(x,y,z,u,v,w)$. We show that all the terms in (1)-(3) and (5), i.e., $I(Y;U|Z)$,
$I(Z;V|U,X)$, $\mathbb{E}\, d_z(Z, \hat{Z}(U, V, X))$, $I(X;W|U,V,Z)$, and $\mathbb{E}\, d_x(X, \hat{X}(U, W, Z))$, are the same whether we evaluate
them by the joint distribution $p(x,y,z,u,v,w)$ of (40) or by $\bar{p}(x,y,z,u,v,w)$ of (39); hence $(R_1, R_2, R_3) \in
\mathcal{R}(D_x, D_z)$. In order to show that the terms above are the same it is enough to show that the marginal distributions
$p(x, y, z, u, v)$ and $p(x, z, u, v, w)$ induced by $p(x, y, z, u, v, w)$ are equal to the marginal distributions $\bar{p}(x, y, z, u, v)$
and $\bar{p}(x, z, u, v, w)$ induced by $\bar{p}(x, y, z, u, v, w)$. Clearly $p(x, y, z, u, v) = \bar{p}(x, y, z, u, v)$. In the rest of the proof
we show $p(x, z, u, v, w) = \bar{p}(x, z, u, v, w)$.

Fig. 8. A graphical proof of the Markov chain $W - (X, U, V) - Z$. The undirected graph corresponds to the joint distribution given in (39),
i.e., $\bar{p}(x, y, z, u, v, w) = p(x, y)\,p(z|x)\,p(u|y)\,p(v|u, z)\,\bar{p}(w|u, v, x, y)$. The Markov chain holds since there is no path from Z to W that does
not pass through $(X, U, V)$.

A distribution of the form $\bar{p}(x, y, z, u, v, w)$ as given in (39) implies that the Markov chain $W - (X, U, V) - Z$ holds
(see Fig. 8 for the proof). Therefore $\bar{p}(w|u, x, v, z) = \bar{p}(w|u, x, v)$. Since $\bar{p}(x, z, u, v, w) = \bar{p}(x, z, v, u)\,\bar{p}(w|x, u, v)$,
and since $p(x,z,v,u) = \bar{p}(x,z,v,u)$ and $p(w|x,u,v) = \bar{p}(w|x,u,v)$ we conclude that $p(x,z,u,v,w) =
\bar{p}(x,z,u,v,w)$. ■

Proof of Theorem 1:

Achievability: The achievability scheme is based on the fact that for the two special cases considered above,
namely $R_2 = 0$ and $R_3 = 0$, the coding scheme for the helper was based on a Wyner-Ziv scheme, where the side
information at the decoder is the random variable that is "further" in the Markov chain $Y - X - Z$, namely Z. The
helper (encoder of Y) employs Wyner-Ziv coding with decoder side information Z and external random variable U,
as seen from (1), i.e., $R_1 \ge I(Y;U|Z)$. The Markov conditions required for such coding, $U - Y - Z$, are satisfied,
hence User Z can recover the codewords constructed from U. Moreover, since (4)
implies $U - Y - X - Z$, User X can also reconstruct U. Therefore, in the coding/decoding schemes that follow,
U serves as side information available at both sides. The source Z encoder now employs Wyner-Ziv coding for Z,
with decoder side information X, coding random variable V, and U available at both sides. The Markov conditions
needed for this scheme are $V - (Z, U) - X$, which again are satisfied by (4). The rate needed for this coding is
$I(Z;V|U,X)$, reflected in the bound on $R_2$ in (2). Finally, the source X encoder employs Wyner-Ziv coding
for X, with decoder side information Z, coding random variable W, and U, V available at both sides. The Markov
conditions needed for this scheme are $W - (X, U, V) - Z$, which again are satisfied by (4). The rate needed for
this coding is $I(X;W|U,V,Z)$, reflected in the bound on $R_3$ in (3). Once the codes are decoded, the destination
can use all the available random variables, $(U, V, X)$ at User X and $(U, W, Z)$ at User Z, to construct $\hat{Z}$ and $\hat{X}$,
respectively.

Converse: Assume that we have an $(n, M_1, M_2, M_3, D_x, D_z)$ code. We now show the existence of a tuple
$(U, V, W, \hat{X}, \hat{Z})$ that satisfies (1)-(5). Denote $T_1 = f_1(Y^n)$, $T_2 = f_2(Z^n, T_1)$, and $T_3 = f_3(X^n, T_2, T_1)$. Then using
the same arguments as in (33) and (34) (just exchanging between X and Z), we obtain

$$nR_1 \ge \sum_{i=1}^{n} H(Y_i|Z_i) - H(Y_i|X^{i-1},T_1,Z_{i+1}^n,Z_i), \tag{41}$$

$$nR_2 \ge \sum_{i=1}^{n} H(Z_i|Z_{i+1}^n,T_1,X^{i-1},X_i) - H(Z_i|Z_{i+1}^n,T_1,X^{i-1},X_i,T_2), \tag{42}$$

respectively. For bounding $R_3$, consider



ni?3 > 



> 



H(T3|ri,r2,^") 



> 



I{X'';n\Ti,T2,Z") 



n 



^ H{X,\X'-\ Z'\Ti.T2) - H(X,\X'-\,Z", T1.T2, T3) 



n 



(a) 



H{X,\X'-\ Zl\Ti,T2) - H(X,\X'-\.Z", T1.T2, T3) 



1=1 
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(43) 



i=l 



where equality (a) is due to the Markov chain Xi — {X^ ^ , ,Ti,T2) — ^ (see Fig. |9]l. Now let us denote 



Fig. 9. A graphical proof of the Markov chain $X_i - (X^{i-1}, Z_i^n, T_1, T_2) - Z^{i-1}$. The undirected graph corresponds to the joint distribution $p(x^{i-1}, z^{i-1})\,p(y^{i-1}|x^{i-1})\,p(x_i, z_i)\,p(y_i|x_i)\,p(x_{i+1}^n, z_{i+1}^n)\,p(y_{i+1}^n|x_{i+1}^n)\,p(t_1|y^n)\,p(t_2|z^n, t_1)$. The Markov chain holds since all paths from $Z^{i-1}$ to $X_i$ pass through $(X^{i-1}, Z_i^n, T_1, T_2)$.



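$U_i = (X^{i-1}, T_1, Z_{i+1}^n)$, $V_i = T_2$ and $W_i = T_3$, and we obtain from (41)-(43), noting that the last term of (43) can only grow when the conditioning on $Z^n$ is reduced to $Z_i^n$,

$$\begin{aligned}
R_1 &\ge \frac{1}{n}\sum_{i=1}^{n} I(Y_i; U_i \mid Z_i), \\
R_2 &\ge \frac{1}{n}\sum_{i=1}^{n} I(Z_i; V_i \mid U_i, X_i), \\
R_3 &\ge \frac{1}{n}\sum_{i=1}^{n} I(X_i; W_i \mid U_i, V_i, Z_i). \qquad (44)
\end{aligned}$$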



Now, we verify that the joint distribution of $(X_i, Y_i, Z_i, U_i, V_i, W_i)$ is of the form (38), i.e., that $U_i - Y_i - (Z_i, X_i)$, $V_i - (U_i, Z_i) - (Y_i, X_i)$ and $W_i - (U_i, V_i, X_i, Y_i) - Z_i$ hold. The Markov chain $(T_1(Y^n), X^{i-1}, Z_{i+1}^n) - Y_i - (Z_i, X_i)$ trivially holds, and the Markov chains

$$Z^{i-1} - (T_1(Y^n), X^{i-1}, Z_i^n) - (Y_i, X_i), \qquad (45)$$

$$X_{i+1}^n - (T_1(Y^n), T_2(T_1, Z^n), X^i, Z_{i+1}^n, Y_i) - Z_i \qquad (46)$$

are proven in Figs. 10 and 11, respectively. Next, we show that there exist sequences of functions $\{\hat{Z}_i(U_i, W_i, X_i)\}$ and $\{\hat{X}_i(U_i, V_i, Z_i)\}$ such that

$$\begin{aligned}
\frac{1}{n}\sum_{i=1}^{n} \mathbb{E}\big[d_x(X_i, \hat{X}_i(U_i, V_i, Z_i))\big] &\le D_x, \\
\frac{1}{n}\sum_{i=1}^{n} \mathbb{E}\big[d_z(Z_i, \hat{Z}_i(U_i, W_i, X_i))\big] &\le D_z. \qquad (47)
\end{aligned}$$

The only difficulty here is that the terms in $(U_i, V_i, Z_i)$ do not include $Z^{i-1}$ and the terms in $(U_i, W_i, X_i)$ do not include $X_{i+1}^n$. However, this is solved by the same argument as for the Wyner-Ziv problem with a helper at the end of Section IV, by showing the Markov forms $X_i - (U_i, V_i, Z_i) - Z^{i-1}$ and $Z_i - (U_i, W_i, X_i) - X_{i+1}^n$, for which the proofs are given in Figs. 12 and 13, respectively.
Fig. 10. A graphical proof of the Markov chain $Z^{i-1} - (T_1(Y^n), X^{i-1}, Z_i^n) - (Y_i, X_i)$. The undirected graph corresponds to the joint distribution $p(x^{i-1}, z^{i-1})\,p(y^{i-1}|x^{i-1})\,p(x_i, z_i)\,p(y_i|x_i)\,p(x_{i+1}^n, z_{i+1}^n)\,p(y_{i+1}^n|x_{i+1}^n)\,p(t_1|y^n)$. The Markov chain holds since all paths from $Z^{i-1}$ to $(X_i, Y_i)$ pass through $(X^{i-1}, Z_i^n, T_1)$.




Fig. 11. A graphical proof of the Markov chain $X_{i+1}^n - (T_1(Y^n), T_2(T_1, Z^n), X^i, Z_{i+1}^n, Y_i) - Z_i$. The undirected graph corresponds to the joint distribution $p(x^{i-1}, y^{i-1})\,p(z^{i-1}|x^{i-1})\,p(x_i, y_i)\,p(z_i|x_i)\,p(x_{i+1}^n, y_{i+1}^n)\,p(z_{i+1}^n|x_{i+1}^n)\,p(t_1|y^n)\,p(t_2|z^n, t_1)$. The Markov chain holds since all paths from $Z_i$ to $X_{i+1}^n$ pass through $(T_1(Y^n), T_2(T_1, Z^n), X^i, Z_{i+1}^n, Y_i)$.



Fig. 12. A graphical proof of the Markov chain $Z^{i-1} - (T_1(Y^n), T_2(T_1, Z^n), X^{i-1}, Z_i^n) - X_i$. The undirected graph corresponds to the joint distribution $p(x^{i-1}, z^{i-1})\,p(y^{i-1}|x^{i-1})\,p(x_i, z_i)\,p(y_i|x_i)\,p(x_{i+1}^n, z_{i+1}^n)\,p(y_{i+1}^n|x_{i+1}^n)\,p(t_1|y^n)\,p(t_2|z^n, t_1)$. The Markov chain holds since all paths from $Z^{i-1}$ to $X_i$ pass through $(T_1(Y^n), T_2(T_1, Z^n), X^{i-1}, Z_i^n)$.



Finally, let $Q$ be a random variable independent of $(X^n, Y^n, Z^n)$ and uniformly distributed over the set $\{1, 2, 3, \dots, n\}$. Define the random variables $U = (Q, U_Q)$, $V = (Q, V_Q)$, $W = (Q, W_Q)$, $X = X_Q$, and $Z = Z_Q$. Then (44)-(47) imply that the equations that define $\mathcal{R}(D_x, D_z)$ in Theorem 1 hold.
Fig. 13. A graphical proof of the Markov chain $Z_i - (U_i, W_i, X_i) - X_{i+1}^n$. The undirected graph corresponds to the joint distribution $p(x^{i-1}, z^{i-1})\,p(y^{i-1}|x^{i-1})\,p(x_i, z_i)\,p(y_i|x_i)\,p(x_{i+1}^n, z_{i+1}^n)\,p(y_{i+1}^n|x_{i+1}^n)\,p(t_1|y^n)\,p(t_3|x^n, t_1)$. The Markov chain holds since all paths from $Z_i$ to $X_{i+1}^n$ pass through $(T_1(Y^n), T_3(T_1, X^n), X^i, Z_{i+1}^n)$.



VII. Two-way multi stage 

Here we consider the two-way multi-stage rate-distortion problem with a helper. First, the helper sends a common message to both users, and then users X and Z send to each other a total rate $R_x$ and $R_z$, respectively, in $K$ rounds. We use the definition of two-way source coding as given in [1], where each user may transmit up to $K$ messages to the other user, each of which depends on the source and on the previously received messages.

Let $\mathcal{M}$ denote a set of positive integers $\{1, 2, \dots, M\}$ and let $\mathcal{M}^K$ denote the collection of $K$ sets $\{\mathcal{M}_1, \mathcal{M}_2, \dots, \mathcal{M}_K\}$.

Fig. 14. The two-way multi-stage problem with a helper. First, Helper Y sends a common message to User X and to User Z at rate $R_y$, and then we have $K$ rounds where in each round $k \in \{1, \dots, K\}$ User Z sends a message to User X at rate $R_{z,k}$, and User X sends a message to User Z at rate $R_{x,k}$. The limitation is on the rate $R_y$ and on the sum rates $R_x = \sum_{k=1}^{K} R_{x,k}$ and $R_z = \sum_{k=1}^{K} R_{z,k}$. We assume that the side information $Y$ and the two sources $X, Z$ are i.i.d. and form the Markov chain $Y - X - Z$.



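Definition 4: An $(n, M_y, M_x^K, M_z^K, D_x, D_z)$ code for two sources $X$ and $Z$ with helper $Y$ consists of the encoders

$$\begin{aligned}
f_y &: \mathcal{Y}^n \to \mathcal{M}_y, \\
f_{z,k} &: \mathcal{Z}^n \times \mathcal{M}_x^{k-1} \times \mathcal{M}_y \to \mathcal{M}_{z,k}, \quad k = 1, 2, \dots, K, \\
f_{x,k} &: \mathcal{X}^n \times \mathcal{M}_z^{k} \times \mathcal{M}_y \to \mathcal{M}_{x,k}, \quad k = 1, 2, \dots, K, \qquad (48)
\end{aligned}$$

and two decoders

$$\begin{aligned}
g_x &: \mathcal{X}^n \times \mathcal{M}_y \times \mathcal{M}_z^{K} \to \hat{\mathcal{Z}}^n, \\
g_z &: \mathcal{Z}^n \times \mathcal{M}_y \times \mathcal{M}_x^{K} \to \hat{\mathcal{X}}^n, \qquad (49)
\end{aligned}$$

such that

$$\begin{aligned}
\mathbb{E}\Big[\frac{1}{n}\sum_{i=1}^{n} d_x(X_i, \hat{X}_i)\Big] &\le D_x, \\
\mathbb{E}\Big[\frac{1}{n}\sum_{i=1}^{n} d_z(Z_i, \hat{Z}_i)\Big] &\le D_z. \qquad (50)
\end{aligned}$$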



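The rate triple $(R_x, R_y, R_z)$ of the code is defined by

$$\begin{aligned}
R_y &= \frac{1}{n}\log M_y, \\
R_x &= \frac{1}{n}\sum_{k=1}^{K}\log M_{x,k}, \\
R_z &= \frac{1}{n}\sum_{k=1}^{K}\log M_{z,k}. \qquad (51)
\end{aligned}$$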
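As a small illustration of (51), the following sketch computes the rate triple from the sizes of the message sets. It assumes base-2 logarithms (rates in bits per source symbol); the function name `rate_triple` and its argument names are hypothetical and not part of the paper.

```python
import math

def rate_triple(n, m_y, m_x, m_z):
    """Return (R_x, R_y, R_z) as in (51), assuming base-2 logarithms.

    n   : block length
    m_y : size of the helper's message set M_y
    m_x : list of the K message-set sizes M_{x,1}, ..., M_{x,K} of User X
    m_z : list of the K message-set sizes M_{z,1}, ..., M_{z,K} of User Z
    """
    r_y = math.log2(m_y) / n
    r_x = sum(math.log2(m) for m in m_x) / n  # sum rate of User X over all rounds
    r_z = sum(math.log2(m) for m in m_z) / n  # sum rate of User Z over all rounds
    return r_x, r_y, r_z
```

Only the sum over the rounds enters $R_x$ and $R_z$, so the way a user splits its budget among the $K$ rounds does not affect the rate triple.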



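Let us denote by $\mathcal{R}^O_K(D_x, D_z)$ the (operational) achievable region of the multi-stage rate distortion problem with a helper, i.e., the closure of the set of all rate triples $(R_x, R_y, R_z)$ that are achievable with a distortion pair $(D_x, D_z)$. Let $\mathcal{R}_K(D_x, D_z)$ be the set of all rate triples $(R_x, R_y, R_z)$ that satisfy

$$R_y \ge I(U;Y|Z), \qquad (52)$$

$$R_z \ge \sum_{k=1}^{K} I(Z; V_k \mid X, U, V^{k-1}, W^{k-1}), \qquad (53)$$

$$R_x \ge \sum_{k=1}^{K} I(X; W_k \mid Z, U, V^{k}, W^{k-1}), \qquad (54)$$

for some auxiliary random variables $(U, V^K, W^K)$ that satisfy

$$U - Y - (X, Z), \qquad (55)$$

$$V_k - (Z, U, V^{k-1}, W^{k-1}) - (X, Y), \quad k = 1, 2, \dots, K, \qquad (56)$$

$$W_k - (X, U, V^{k}, W^{k-1}) - (Z, Y), \quad k = 1, 2, \dots, K, \qquad (57)$$

$$\mathbb{E}\,d_x(X, \hat{X}(U, W^K, Z)) \le D_x, \qquad \mathbb{E}\,d_z(Z, \hat{Z}(U, V^K, X)) \le D_z. \qquad (58)$$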
The Markov chain $Y - X - Z$ and the Markov chains given in (55)-(57) imply that the joint distribution of $X, Y, Z, U, V^K, W^K$ is of the form $p(x, y)\,p(z|x)\,p(u|y)\prod_{k=1}^{K} p(v_k|z, u, v^{k-1}, w^{k-1})\,p(w_k|x, u, v^{k}, w^{k-1})$. Furthermore, (53) and (54) can be written as

$$R_z \ge I(Z; V^K, W^K \mid X, U), \qquad (59)$$

$$R_x \ge I(X; V^K, W^K \mid Z, U), \qquad (60)$$

due to the Markov chains $Z - (X, U, V^{k}, W^{k-1}) - W_k$ and $X - (Z, U, V^{k-1}, W^{k-1}) - V_k$.
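The equivalence between the sum form and the joint form of the bound on $R_z$ can be checked with the chain rule for mutual information. The following short derivation is a sketch of that step; the bound on $R_x$ follows symmetrically, with the roles of $V_k$ and $W_k$ exchanged.

```latex
\begin{align*}
I(Z; V^K, W^K \mid X, U)
  &= \sum_{k=1}^{K} \Big[ I(Z; V_k \mid X, U, V^{k-1}, W^{k-1})
       + I(Z; W_k \mid X, U, V^{k}, W^{k-1}) \Big] \\
  &= \sum_{k=1}^{K} I(Z; V_k \mid X, U, V^{k-1}, W^{k-1}),
\end{align*}
% The second term in each bracket is zero because of the Markov chain
% Z - (X, U, V^k, W^{k-1}) - W_k.
```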
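Lemma 9: 1) The region $\mathcal{R}_K(D_x, D_z)$ is convex.

2) To exhaust $\mathcal{R}_K(D_x, D_z)$, it is enough to restrict the alphabets of $U$, $V$, and $W$ to satisfy

$$\begin{aligned}
|\mathcal{U}| &\le |\mathcal{Y}| + 2K + 1, \\
|\mathcal{V}_k| &\le |\mathcal{Z}||\mathcal{U}||\mathcal{V}^{k-1}||\mathcal{W}^{k-1}| + 2(K + 1 - k) + 1, \quad k = 1, \dots, K, \\
|\mathcal{W}_k| &\le |\mathcal{X}||\mathcal{U}||\mathcal{V}^{k}||\mathcal{W}^{k-1}| + 2(K + 1 - k), \quad k = 1, \dots, K. \qquad (61)
\end{aligned}$$

The proof of the lemma is analogous to the proof of Lemma 2 and is therefore omitted.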

Theorem 10: In the two-way problem with $K$ stages of communication and a helper, as depicted in Fig. 14, where $Y - X - Z$,

$$\mathcal{R}^O_K(D_x, D_z) = \mathcal{R}_K(D_x, D_z). \qquad (62)$$

Theorem 10 is a generalization of Theorem 1 (equations (52)-(58) with $K = 1$ are equivalent to the equations of Theorem 1), and its proof is a straightforward extension. Here we explain only the extensions.

Sketch of achievability: In the achievability proof of Theorem 1 we generated the sequences $(U^n, V_1^n, W_1^n)$ that are jointly typical with $X^n, Y^n, Z^n$. Using the same idea of Wyner-Ziv coding, we continue and generate at any stage $k = 1, 2, \dots, K$ the sequence $V_k^n$ that is jointly typical with the other sequences by transmitting a message at rate $I(Z; V_k \mid X, U, V^{k-1}, W^{k-1})$ from User Z to User X, and similarly the sequence $W_k^n$ that is jointly typical with the other sequences by transmitting a message at rate $I(X; W_k \mid Z, U, V^{k}, W^{k-1})$ from User X to User Z. In the final stage, User X uses the sequences $(X^n, U^n, V_1^n, \dots, V_K^n)$ to construct $\hat{Z}^n$ and, similarly, User Z uses the sequences $(Z^n, U^n, W_1^n, \dots, W_K^n)$ to construct $\hat{X}^n$.

Sketch of converse: Assume that we have an $(n, M_y, M_x^K, M_z^K, D_x, D_z)$ code; we will show the existence of a vector $(U, V^K, W^K, \hat{X}, \hat{Z})$ that satisfies (52)-(58). Denote $T_y = f_y(Y^n)$, $T_{z,k} = f_{z,k}(Z^n, T_y, T_x^{k-1})$, and $T_{x,k} = f_{x,k}(X^n, T_y, T_z^{k})$. Then, by the same arguments as in (41), we obtain

$$nR_y \ge \sum_{i=1}^{n} I(Y_i; X^{i-1}, T_y, Z_{i+1}^n \mid Z_i). \qquad (63)$$

Then we have

$$nR_z \ge H(T_z^K) = \sum_{k=1}^{K} H(T_{z,k} \mid T_z^{k-1}) \ge \sum_{k=1}^{K} H(T_{z,k} \mid T_z^{k-1}, T_x^{k-1}), \qquad (64)$$
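$$nR_x \ge H(T_x^K) = \sum_{k=1}^{K} H(T_{x,k} \mid T_x^{k-1}) \ge \sum_{k=1}^{K} H(T_{x,k} \mid T_z^{k}, T_x^{k-1}). \qquad (65)$$

Applying the same arguments as in (42) and (43) on the terms in (64) and (65), respectively, we obtain that

$$\begin{aligned}
nR_z &\ge \sum_{i=1}^{n}\sum_{k=1}^{K} I(Z_i; T_{z,k} \mid X_i, X^{i-1}, T_y, Z_{i+1}^n, T_z^{k-1}, T_x^{k-1}), \\
nR_x &\ge \sum_{i=1}^{n}\sum_{k=1}^{K} I(X_i; T_{x,k} \mid Z_i, X^{i-1}, T_y, Z_{i+1}^n, T_z^{k}, T_x^{k-1}). \qquad (66)
\end{aligned}$$

We define the auxiliary random variables as $U = (X^{Q-1}, T_y, Z_{Q+1}^n)$, $V_k = T_{z,k}$ and $W_k = T_{x,k}$, where $Q$ is distributed uniformly on the integers $\{1, 2, \dots, n\}$. ■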

VIII. Gaussian Case 

In this section we consider the Gaussian instance of the two-way setting with a helper as defined in Section II and explicitly express the region for a mean square error distortion (we also note that the multi-stage option does not increase the rate region for this case).



Fig. 15. The Gaussian two-way problem with a helper. The side information $Y$ and the two sources $X, Z$ are i.i.d., jointly Gaussian and form the Markov chain $Y - X - Z$: User X observes $X = Z + A$ and the helper observes $Y = Z + A + B$, where $A \sim N(0, \sigma_A^2)$, $B \sim N(0, \sigma_B^2)$, $Z \sim N(0, \sigma_Z^2)$ and $A \perp B \perp Z$. The distortion is the square error, i.e., $d_x(X^n, \hat{X}^n) = \frac{1}{n}\sum_{i=1}^{n}(X_i - \hat{X}_i)^2$ and $d_z(Z^n, \hat{Z}^n) = \frac{1}{n}\sum_{i=1}^{n}(Z_i - \hat{Z}_i)^2$.



Since $X, Y, Z$ form the Markov chain $Y - X - Z$, we assume, without loss of generality, that $X = Z + A$ and $Y = Z + A + B$, where the random variables $(A, B, Z)$ are zero-mean Gaussian and independent of each other, with $\mathbb{E}[A^2] = \sigma_A^2$, $\mathbb{E}[B^2] = \sigma_B^2$ and $\mathbb{E}[Z^2] = \sigma_Z^2$.



Corollary 11: The achievable rate region of the problem illustrated in Fig. 15 is

$$R_z \ge \frac{1}{2}\log\frac{\sigma_A^2\sigma_Z^2}{D_z(\sigma_A^2 + \sigma_Z^2)}, \qquad (67)$$

$$R_x \ge \frac{1}{2}\log\left[\frac{\sigma_A^2}{D_x}\left(1 - \frac{\sigma_A^2\,(1 - 2^{-2R_y})}{\sigma_A^2 + \sigma_B^2}\right)\right]. \qquad (68)$$

Proof: The converse and achievability of (67) follow from the Gaussian Wyner-Ziv coding result [18], which states that the achievable rate for the Gaussian Wyner-Ziv setting is the same as in the case where the side information is known to both the encoder and the decoder. Furthermore, because of the Markov chain $Z - X - Y$, the rate $R_y$ does not have any influence on $R_z$, since this rate is the achievable rate even if $Y$ is known to both users. The achievability and the converse for $R_x$ are given in the following corollary. ■

Fig. 16. Gaussian case: the zero-mean Gaussian random variables $A, B, Z$ are i.i.d. and independent of each other. Their variances are $\sigma_A^2$, $\sigma_B^2$ and $\sigma_Z^2$, respectively. The source $X$ and the helper $Y$ satisfy $X = A + Z$ and $Y = Z + A + B$. The distortion is the square error, i.e., $d(X^n, \hat{X}^n) = \frac{1}{n}\sum_{i=1}^{n}(X_i - \hat{X}_i)^2$.



Corollary 12: The achievable rate region of the problem illustrated in Fig. 16 is

$$R \ge \frac{1}{2}\log\left[\frac{\sigma_A^2}{D}\left(1 - \frac{\sigma_A^2\,(1 - 2^{-2R_1})}{\sigma_A^2 + \sigma_B^2}\right)\right]. \qquad (69)$$

It is interesting to note that the rate region does not depend on $\sigma_Z^2$. Furthermore, we show in the proof that for the Gaussian case the rate region is the same as when $Z$ is known to the source $X$ and the helper $Y$.
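The two extreme points of the bound in Corollary 12 can be checked numerically. The sketch below takes the bound in the form $R \ge \frac{1}{2}\log\big[\frac{\sigma_A^2}{D}\big(1 - \frac{\sigma_A^2(1 - 2^{-2R_1})}{\sigma_A^2 + \sigma_B^2}\big)\big]$, the form obtained by combining (72) and (73), and assumes base-2 logarithms. The function name `min_source_rate` is hypothetical, and clamping the rate at zero for large distortions is an added assumption.

```python
import math

def min_source_rate(dist, r1, var_a, var_b):
    """Smallest source rate R for distortion `dist` and helper rate `r1`.

    var_a, var_b are the variances of A and B; the variance of Z does not appear.
    The result is clamped at 0 because a rate cannot be negative.
    """
    c = var_a / (var_a + var_b)
    arg = (var_a / dist) * (1.0 - c * (1.0 - 2.0 ** (-2.0 * r1)))
    return max(0.0, 0.5 * math.log2(arg))
```

With no help ($R_1 = 0$) the expression reduces to $\frac{1}{2}\log(\sigma_A^2/D)$, the Gaussian Wyner-Ziv rate with side information $Z$. As $R_1$ grows it approaches $\frac{1}{2}\log\frac{\sigma_A^2\sigma_B^2}{(\sigma_A^2 + \sigma_B^2)D}$, the rate when $Y$ is fully known at both sides.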
Proof of Corollary 12:

Converse: Assume that both encoders observe $Z^n$. Without loss of generality, the encoders can subtract $Z$ from $X$ and $Y$; hence the problem is equivalent to a new rate distortion problem with a helper, where the source is $A$ and the helper is $A + B$. Now, using the result for the Gaussian case from [7], adapted to our notation, we obtain (69).
Achievability: Before proving the direct part of Corollary 12, we establish the following lemma, which is proved in Appendix C.

Lemma 13 (Gaussian Wyner-Ziv rate-distortion problem with additional side information known to the encoder and decoder): Let $(X, W, Z)$ be jointly Gaussian. Consider the Wyner-Ziv rate distortion problem where the source $X$ is to be compressed with quadratic distortion measure, $W$ is available at the encoder and decoder, and $Z$ is available only at the decoder. The rate-distortion function for this problem is given by

$$R(D) = \frac{1}{2}\log\frac{\sigma^2_{X|W,Z}}{D}, \qquad (70)$$

where $\sigma^2_{X|W,Z} = \mathbb{E}[(X - \mathbb{E}[X|W,Z])^2]$, i.e., the minimum square error of estimating $X$ from $(W, Z)$.
Let $V = A + B + Z + D$, where $D \sim N(0, \sigma_D^2)$ is independent of $(A, B, Z)$. Clearly, we have $V - Y - X - Z$. Now, let us generate $V$ at the source-encoder and at the decoder using the achievability scheme of Wyner [18]. Since $I(V;Z) \le I(V;X)$, a rate $R' = I(V;Y) - I(V;Z)$ would suffice, and it may be expressed as follows:

$$\begin{aligned}
R' &= I(V;Y|Z) \\
&= h(V|Z) - h(V|Y) \\
&= \frac{1}{2}\log\frac{\sigma_A^2 + \sigma_B^2 + \sigma_D^2}{\sigma_D^2}, \qquad (71)
\end{aligned}$$

and this implies that

$$\sigma_D^2 = \frac{\sigma_A^2 + \sigma_B^2}{2^{2R'} - 1}. \qquad (72)$$

Now, we invoke Lemma 13, where $V$ is the side information known both to the encoder and decoder; hence a rate that satisfies the following inequality achieves a distortion $D$:

$$R \ge \frac{1}{2}\log\frac{\sigma^2_{X|V,Z}}{D} = \frac{1}{2}\log\left[\frac{\sigma_A^2}{D}\left(1 - \frac{\sigma_A^2}{\sigma_A^2 + \sigma_B^2 + \sigma_D^2}\right)\right]. \qquad (73)$$

Finally, by replacing $\sigma_D^2$ with the identity in (72), we obtain (69).
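The substitution in this last step can be sanity-checked numerically. The sketch below (hypothetical function names, base-2 logarithms assumed) picks a test-channel noise variance $\sigma_D^2$, computes the helper rate from (71), inverts it with (72), and compares the rate from (73) with the closed form $\frac{1}{2}\log\big[\frac{\sigma_A^2}{D}\big(1 - \frac{\sigma_A^2(1 - 2^{-2R'})}{\sigma_A^2 + \sigma_B^2}\big)\big]$.

```python
import math

def helper_rate(var_a, var_b, var_d):
    """Helper rate R' of (71) for test-channel noise variance var_d."""
    return 0.5 * math.log2((var_a + var_b + var_d) / var_d)

def noise_variance(var_a, var_b, r_helper):
    """Noise variance of (72), i.e. (71) solved for var_d."""
    return (var_a + var_b) / (2.0 ** (2.0 * r_helper) - 1.0)

def rate_from_noise(dist, var_a, var_b, var_d):
    """Source rate of (73), expressed through var_d."""
    return 0.5 * math.log2((var_a / dist) * (1.0 - var_a / (var_a + var_b + var_d)))

def rate_from_helper_rate(dist, var_a, var_b, r_helper):
    """Same source rate after substituting (72), expressed through R'."""
    c = var_a / (var_a + var_b)
    return 0.5 * math.log2((var_a / dist) * (1.0 - c * (1.0 - 2.0 ** (-2.0 * r_helper))))
```

For example, with $\sigma_A^2 = 1$, $\sigma_B^2 = 2$ and $\sigma_D^2 = 0.5$, `helper_rate` gives $R' = \frac{1}{2}\log_2 7$, `noise_variance` maps it back to $0.5$, and the two rate functions agree.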

IX. Further results on Wyner-Ziv with a helper where Y - X - Z 

In this section we investigate two properties of the rate region of the Wyner-Ziv setting (Fig. 17) with a Markov form $Y - X - Z$. First, we investigate the tradeoff between the rate sent by the helper and the rate sent by the source; roughly speaking, we conclude that a bit from the source is more "valuable" than a bit from the helper. Second, we examine the case where the helper has the freedom to send different messages, at different rates, to the encoder and the decoder. We show that "more help" to the encoder than to the decoder does not yield any performance gain, and that in such cases the freedom to send different messages to the encoder and the decoder yields no gain over the case of a common message. Further, in this setting of different messages, the rate to the encoder can be strictly less than that to the decoder with no performance loss.

A. A bit from the source-encoder vs. a bit from the helper 

Assume that we have a sequence of $(n, 2^{nR}, 2^{nR_1})$ codes that achieves a distortion $D$, such that the triple $(R, R_1, D)$ is on the border of the region $\mathcal{R}_{Y-X-Z}(D)$ defined earlier. Now, suppose that the helper is allowed to increase the rate by an amount $\Delta' > 0$ to $R_1 + \Delta'$; to what rate $R - \Delta$ can the source-encoder reduce its rate and still achieve the same distortion $D$?

Despite the fact that the additional rate $\Delta'$ is transmitted both to the decoder and encoder, we show that always $\Delta \le \Delta'$. Let us denote by $R(R_1)$ the boundary of the region $\mathcal{R}_{Y-X-Z}(D)$ for a fixed $D$. We formally show that $\Delta \le \Delta'$ by proving that the magnitude of the slope of the curve $R(R_1)$ is never larger than 1. The proof uses a technique similar to that in [19].
Fig. 17. Wyner-Ziv problem with a helper where the Markov chain $Y - X - Z$ holds.



Lemma 14: For any $Y - X - Z$ and $D$, the subgradients of the curve $R(R_1)$ are at most 1 in absolute value.

Proof: Since $\mathcal{R}_{Y-X-Z}(D)$ is a convex set, $R(R_1)$ is a convex function. Furthermore, $R(R_1)$ is non-increasing in $R_1$. Now, let us define $J^*(\lambda)$ as

$$J^*(\lambda) = \min_{p(x,y,z,u,w,\hat{x}) \in \mathcal{P}} I(X;W|U,Z) + \lambda I(Y;U|Z), \qquad (74)$$

where $\mathcal{P}$ is the set of distributions satisfying $p(x, y, z, u, w, \hat{x}) = p(x, y)\,p(z|x)\,p(u|y)\,p(w|u, x)\,p(\hat{x}|u, w, z)$ and $\mathbb{E}\,d(X, \hat{X}) \le D$. The line $J^*(\lambda) = R + \lambda R_1$ is a support line of $R(R_1)$, and therefore $-\lambda$ is a subgradient. The value $J^*(\lambda)$ is the intersection between the support line with slope $-\lambda$ and the $R$ axis, as shown in Fig. 18. Because of the convexity and the monotonicity of $R(R_1)$, $J^*(\lambda)$ is upper-bounded by $R(0)$, i.e.,

$$J^*(\lambda) \le R(0) = \min_{p(x,\hat{x},z,w) \in \mathcal{P}_{wz}} I(X;W|Z), \qquad (75)$$

where $\mathcal{P}_{wz}$ is the set of distributions that satisfy $p(x, \hat{x}, z, w) = p(x)\,p(z|x)\,p(w|x)\,p(\hat{x}|w, z)$ and $\mathbb{E}\,d(X, \hat{X}) \le D$. In addition, we observe that

Fig. 18. A support line of $R(R_1)$ with a slope $-\lambda$. $J^*(\lambda)$ is the intersection of the support line with the $R$ axis.



$$\begin{aligned}
J^*(1) &= \min_{p(x,y,z,u,w,\hat{x}) \in \mathcal{P}} I(X;W|U,Z) + I(Y;U|Z) \\
&\stackrel{(a)}{=} \min_{p(x,y,z,u,w,\hat{x}) \in \mathcal{P}} I(X,Y;W,U|Z) \\
&\ge \min_{p(x,y,z,u,w,\hat{x}) \in \mathcal{P}} I(X;W,U|Z) \\
&\ge \min_{p(x,\hat{x},z,w) \in \mathcal{P}_{wz}} I(X;W|Z), \qquad (76)
\end{aligned}$$

where step (a) is due to the Markov chains $U - Y - (Z, X)$ and $W - (U, X) - (Y, Z)$, and the last step follows by regarding the pair $(W, U)$ as a single auxiliary random variable. Combining (75) and (76), we conclude that for any subgradient $-\lambda$, $J^*(\lambda) \le J^*(1)$. Since $J^*(\lambda)$ is increasing in $\lambda$, we conclude that $\lambda \le 1$. ■

Fig. 19. The rate distortion problem with decoder side information and independent helper rates. We assume the Markov relation $Y - X - Z$.

An alternative and equivalent proof would be to claim that, since $R(R_1)$ is a convex and non-increasing function, $\frac{\Delta}{\Delta'} \le \left|\frac{dR}{dR_1}\right|_{R_1 = 0}$, and then to claim that the largest slope at $R_1 = 0$ is attained when $Y = X$, in which case it is 1. For the Gaussian case, the derivative may be calculated explicitly from (69), in particular for $R_1 = 0$, and we obtain

$$\Delta \le \frac{\sigma_A^2}{\sigma_A^2 + \sigma_B^2}\,\Delta'. \qquad (77)$$
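The Gaussian slope in (77) can be checked with a finite difference. The sketch below assumes the Gaussian rate expression $R(R_1) = \frac{1}{2}\log\big[\frac{\sigma_A^2}{D}\big(1 - \frac{\sigma_A^2(1 - 2^{-2R_1})}{\sigma_A^2 + \sigma_B^2}\big)\big]$ with base-2 logarithms; the function names and the step size `h` are hypothetical choices made for this illustration.

```python
import math

def source_rate(r1, var_a, var_b, dist):
    """Gaussian boundary curve R(R_1), assumed to be in the positive-rate regime."""
    c = var_a / (var_a + var_b)
    return 0.5 * math.log2((var_a / dist) * (1.0 - c * (1.0 - 2.0 ** (-2.0 * r1))))

def slope_at_zero(var_a, var_b, dist, h=1e-6):
    """Forward-difference estimate of dR/dR_1 at R_1 = 0."""
    return (source_rate(h, var_a, var_b, dist) - source_rate(0.0, var_a, var_b, dist)) / h
```

For $\sigma_A^2 = 1$ and $\sigma_B^2 = 2$ the estimate is close to $-1/3 = -\sigma_A^2/(\sigma_A^2 + \sigma_B^2)$, so one extra helper bit saves the source at most a third of a bit. The slope does not depend on $D$ and its magnitude is strictly below 1 whenever $\sigma_B^2 > 0$.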

B. The case of independent rates 

In this subsection we treat the rate distortion scenario where side information from the helper is encoded using two different messages, possibly at different rates, one to the encoder and one to the decoder, as shown in Fig. 19.
The complete characterization of achievable rates for this scenario is still an open problem. However, the solution 
that is given in previous sections, where there is one message known both to the encoder and decoder, provides us 
insight that allows us to solve several cases of the problem shown here. We start with the definition of the general 
case. 

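Definition 5: An $(n, M, M_e, M_d, D)$ code for source $X$ with side information $Y$ and different helper messages to the encoder and decoder consists of three encoders

$$\begin{aligned}
f_e &: \mathcal{Y}^n \to \{1, 2, \dots, M_e\}, \\
f_d &: \mathcal{Y}^n \to \{1, 2, \dots, M_d\}, \\
f &: \mathcal{X}^n \times \{1, 2, \dots, M_e\} \to \{1, 2, \dots, M\}, \qquad (78)
\end{aligned}$$

and a decoder

$$g : \{1, 2, \dots, M\} \times \{1, 2, \dots, M_d\} \times \mathcal{Z}^n \to \hat{\mathcal{X}}^n, \qquad (79)$$

such that

$$\mathbb{E}\,d(X^n, \hat{X}^n) \le D. \qquad (80)$$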

To avoid cumbersome statements, we will not repeat in the sequel the words "... different helper messages to the encoder and decoder," as this is the topic of this section and should be clear from the context. The rate triple $(R, R_e, R_d)$ of the $(n, M, M_e, M_d, D)$ code is

$$R = \frac{1}{n}\log M, \qquad R_e = \frac{1}{n}\log M_e, \qquad R_d = \frac{1}{n}\log M_d. \qquad (81)$$


Definition 6: Given a distortion $D$, a rate triple $(R, R_e, R_d)$ is said to be achievable if for any $\delta > 0$ and sufficiently large $n$, there exists an $(n, 2^{n(R+\delta)}, 2^{n(R_e+\delta)}, 2^{n(R_d+\delta)}, D + \delta)$ code for the source $X$ with side information $Y$.

Definition 7: The (operational) achievable region $\mathcal{R}^O(D)$ of rate distortion with a helper known at the encoder and decoder is the closure of the set of all achievable rate triples at distortion $D$.
Denote by $\mathcal{R}^O(R_e, R_d, D)$ the section of $\mathcal{R}^O(D)$ at helper rates $(R_e, R_d)$. That is,

$$\mathcal{R}^O(R_e, R_d, D) = \{R : (R, R_e, R_d) \text{ is achievable with distortion } D\}, \qquad (82)$$

and similarly, denote by $\mathcal{R}(R_1, D)$ the section of the region $\mathcal{R}_{Y-X-Z}(D)$ at helper rate $R_1$. Recall that, according to the theorem characterizing $\mathcal{R}_{Y-X-Z}(D)$, $\mathcal{R}(R_1, D)$ consists of all achievable source coding rates when the helper sends common messages to the source encoder and destination at rate $R_1$. The main result of this section is the following.



Theorem 15: For any $R_e \ge R_d$,

$$\mathcal{R}^O(R_e, R_d, D) = \mathcal{R}(R_d, D). \qquad (83)$$

Theorem 15 has interesting implications for the coding strategy taken by the helper. It says that no gain in performance can be achieved if the source encoder gets "more help" than the decoder at the destination (i.e., if $R_e > R_d$), and thus we may restrict $R_e$ to be no higher than $R_d$. Moreover, in those cases where $R_e = R_d$, optimal performance is achieved when the helper sends to the encoder and decoder exactly the same message. The proof of this statement uses operational arguments.
Proof of Theorem 15: Clearly, the claim is proved once we show the statement for $R_e = H(Y)$. In this situation, we can equally well assume that the encoder has full access to $Y$. Thus, fix a general scheme as in Definition 5 with $R_e = H(Y)$. The encoder is a function of the form $f(X^n, Y^n)$. Define $T_2 = f_d(Y^n)$. The Markov chain $Z - X - Y$ implies that $Z^n - (X^n, T_2) - Y^n$ also forms a Markov chain. This implies, in turn, that there exist a function $\phi$ and a random variable $W$, uniformly distributed in $[0, 1]$ and independent of $(X^n, T_2, Z^n)$, such that

$$Y^n = \phi(X^n, T_2, W). \qquad (84)$$

Thus the source encoder operation can be written as

$$f(X^n, Y^n) = f(X^n, \phi(X^n, T_2, W)) = \tilde{f}(X^n, T_2, W), \qquad (85)$$

implying, in turn, that the distortion of this scheme can be expressed as

$$\begin{aligned}
\mathbb{E}\,d\big(X^n, \hat{X}^n(\tilde{f}(X^n, T_2, W), T_2, Z^n)\big)
&\stackrel{(a)}{=} \int_0^1 \mathbb{E}\big[d\big(X^n, \hat{X}^n(\tilde{f}(X^n, T_2, w), T_2, Z^n)\big)\big]\,dw \\
&\stackrel{(b)}{=} \int_0^1 \mathbb{E}\big[d\big(X^n, \hat{X}^n(f_w(X^n, T_2), T_2, Z^n)\big)\big]\,dw, \qquad (86)
\end{aligned}$$

where (a) holds since $W$ is independent of $(X^n, T_2, Z^n)$, and (b) holds by defining

$$f_w(X^n, T_2) = \tilde{f}(X^n, T_2, w). \qquad (87)$$

Note that for a given $w$, the function $f_w$ is of the form of the encoding functions used when the helper sends one message to the encoder and decoder. Therefore, we conclude that anything achievable with a scheme from Definition 5 is achievable by time-sharing between schemes where the helper sends one message to the encoder and decoder. ■
The statement of Theorem 15 can be extended to rates $R_e$ slightly lower than $R_d$. This extension is based on the simple observation that the source encoder knows $X$, which can serve as side information in decoding the message sent by the helper. Therefore, any message $T_2$ sent to the source decoder can undergo a stage of binning with respect to $X$. As an extreme example, consider the case where $R_e \ge H(Y|X)$. The source encoder can fully recover $Y$, hence there is no advantage in transmitting to the encoder at rates higher than $H(Y|X)$; the decoder, on the other hand, can benefit from rates in the region $H(Y|X) < R_d < H(Y|Z)$. This rate interval is not empty due to the Markov chain $Y - X - Z$. These observations are summarized in the next theorem.
Theorem 16:

1) Let $(U, V)$ achieve a point $(R, R')$ in $\mathcal{R}_{Y-X-Z}(D)$, i.e.,

$$R = I(X;U|V,Z), \qquad (88)$$

$$R' = I(Y;V|Z) = I(V;Y) - I(V;Z), \qquad (89)$$

$$D \ge \mathbb{E}\,d(X, \hat{X}(U, V, Z)), \qquad V - Y - X - Z. \qquad (90)$$

Then $(R, R_e, R') \in \mathcal{R}^O(D)$ for every $R_e$ satisfying

$$R_e \ge I(V;Y|Z) - I(V;X|Z) = I(V;Y) - I(V;X). \qquad (91)$$

2) Let $(R, R')$ be an outer point of $\mathcal{R}_{Y-X-Z}(D)$. That is,

$$(R, R') \notin \mathcal{R}_{Y-X-Z}(D). \qquad (92)$$

Then $(R, R_e, R')$ is an outer point of $\mathcal{R}^O(D)$ for any $R_e$, i.e.,

$$(R, R_e, R') \notin \mathcal{R}^O(D) \quad \forall R_e. \qquad (93)$$

The proof of Part 1 is based on binning, as described above. In particular, observe that $R_e$ given in (91) is lower than $R'$ of (89) due to the Markov chain $V - Y - X - Z$. Part 2 is a partial converse, and is a direct consequence of Theorem 15. The details, being straightforward, are omitted.
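To get a feel for the gap between the two helper rates in Part 1, the following sketch evaluates them for a Gaussian example. It is only an illustration under stated assumptions: the Gaussian model $X = Z + A$, $Y = X + B$ of Section VIII is used, the auxiliary variable is taken to be $V = Y + D$ with independent Gaussian noise $D$ (this choice is not claimed to be optimal), and logarithms are base 2. The function name `helper_rates` is hypothetical.

```python
import math

def helper_rates(var_a, var_b, var_d):
    """Helper rates for V = Y + D, Y = X + B, X = Z + A (independent zero-mean Gaussians).

    Returns (r_enc, r_dec), where
      r_enc = I(V;Y) - I(V;X), the rate to the encoder as in (91),
      r_dec = I(V;Y) - I(V;Z), the rate to the decoder as in (89).
    """
    # Var(V | Y) = var_d, Var(V | X) = var_b + var_d, Var(V | Z) = var_a + var_b + var_d.
    r_enc = 0.5 * math.log2((var_b + var_d) / var_d)
    r_dec = 0.5 * math.log2((var_a + var_b + var_d) / var_d)
    return r_enc, r_dec
```

With unit variances the encoder needs $\frac{1}{2}\log_2 2 = 0.5$ bit per symbol while the decoder needs $\frac{1}{2}\log_2 3$, so the rate to the encoder is strictly smaller, in line with the binning argument.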

Appendix A 

Proof of the technique for verifying Markov relations
Proof: First, let us prove that three random variables $X, Y, Z$ with a joint distribution of the form

$$p(x, y, z) = f(x, y)f(y, z) \qquad (94)$$

satisfy the Markov chain $X - Y - Z$. Consider

$$p(z|y, x) = \frac{p(x, y, z)}{p(x, y)} = \frac{f(x, y)f(y, z)}{\sum_{z'} f(x, y)f(y, z')} = \frac{f(y, z)}{\sum_{z'} f(y, z')}, \qquad (95)$$

and since the expression does not include the argument $x$, we conclude that $p(z|y, x) = p(z|y)$.

For the more general case, we first extend the sets $X_{G_1}$ and $X_{G_3}$. We start by defining $\bar{G}_1 = G_1$ and $\bar{G}_3 = G_3$, and then we add to $X_{\bar{G}_1}$ and to $X_{\bar{G}_3}$ all their neighbors that are not in $X_{G_2}$ (a neighbor of a group is a node that is connected by one edge to an element in the group). We repeat this procedure until there are no more nodes to add to $X_{\bar{G}_1}$ or $X_{\bar{G}_3}$. Note that since there are no paths from $X_{G_1}$ to $X_{G_3}$ that do not pass through $X_{G_2}$, a node cannot be added to both sets $X_{\bar{G}_1}$ and $X_{\bar{G}_3}$. The set of nodes that are not in $(X_{\bar{G}_1}, X_{G_2}, X_{\bar{G}_3})$ is denoted by $X_{G_0}$.

The sets $X_{G_0}$, $X_{\bar{G}_1}$ and $X_{\bar{G}_3}$ are connected only to $X_{G_2}$ and not to each other; hence the joint distribution of $(X_{G_0}, X_{\bar{G}_1}, X_{G_2}, X_{\bar{G}_3})$ is of the following form:

$$p(X_{G_0}, X_{\bar{G}_1}, X_{G_2}, X_{\bar{G}_3}) = f(X_{G_0}, X_{G_2})\,f(X_{\bar{G}_1}, X_{G_2})\,f(X_{\bar{G}_3}, X_{G_2}). \qquad (96)$$

By marginalizing over $X_{G_0}$ and using the claim introduced in the first sentence of the proof, we obtain the Markov chain $X_{\bar{G}_1} - X_{G_2} - X_{\bar{G}_3}$, which implies $X_{G_1} - X_{G_2} - X_{G_3}$. ■
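The graphical test itself is easy to mechanize. The sketch below is an illustration rather than part of the paper: it builds the undirected graph by connecting all variables that share a factor and then runs a breadth-first search from the first group while treating the middle group as removed. The names `separated` and `FIG9_FACTORS` and the block-variable labels (`Xp` for $X^{i-1}$, `Xi` for $X_i$, `Xf` for $X_{i+1}^n$, and likewise for $Y$ and $Z$) are hypothetical. A result of `True` certifies the Markov chain; `False` only means that this sufficient condition fails.

```python
from collections import deque
from itertools import combinations

def separated(factors, group_a, group_b, separator):
    """Return True if every path from group_a to group_b passes through separator."""
    # Build the undirected graph: variables sharing a factor are pairwise connected.
    adj = {}
    for factor in factors:
        for u in factor:
            adj.setdefault(u, set())
        for u, v in combinations(factor, 2):
            adj[u].add(v)
            adj[v].add(u)
    blocked = set(separator)
    start = set(group_a) - blocked
    seen = set(start)
    queue = deque(start)
    while queue:  # breadth-first search that never enters the separator
        node = queue.popleft()
        if node in group_b:
            return False
        for nxt in adj.get(node, ()):
            if nxt not in blocked and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return True

# Factors of the joint distribution in the caption of Fig. 9, with block variables.
FIG9_FACTORS = [
    ("Xp", "Zp"), ("Yp", "Xp"),            # p(x^{i-1}, z^{i-1}) p(y^{i-1} | x^{i-1})
    ("Xi", "Zi"), ("Yi", "Xi"),            # p(x_i, z_i) p(y_i | x_i)
    ("Xf", "Zf"), ("Yf", "Xf"),            # p(x_{i+1}^n, z_{i+1}^n) p(y_{i+1}^n | x_{i+1}^n)
    ("T1", "Yp", "Yi", "Yf"),              # p(t_1 | y^n)
    ("T2", "Zp", "Zi", "Zf", "T1"),        # p(t_2 | z^n, t_1)
]
```

Calling `separated(FIG9_FACTORS, {"Zp"}, {"Xi"}, {"Xp", "Zi", "Zf", "T1", "T2"})` confirms the chain of Fig. 9, whereas dropping `"Zi"` from the separator leaves the path $Z^{i-1}$, $Z_i$, $X_i$ open and the test fails.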
Appendix B
Proof of Lemma 2

Proof: To prove Part 1, let $Q$ be a time-sharing random variable, independent of the source triple $(X, Y, Z)$. Note that

$$\begin{aligned}
I(Y;U|Z,Q) &\stackrel{(a)}{=} I(Y;U,Q|Z) = I(Y;\tilde{U}|Z), \\
I(Z;V|U,X,Q) &= I(Z;V|\tilde{U},X), \\
I(X;W|U,V,Z,Q) &= I(X;W|\tilde{U},V,Z),
\end{aligned}$$

where $\tilde{U} = (U, Q)$, and in step (a) we used the fact that $Y$ is independent of $Q$. This proves the convexity.

To prove Part 2, we invoke the support lemma [20, pp. 310] three times, each time for one of the auxiliary random variables $U, V, W$. The external random variable $U$ must have $|\mathcal{Y}| - 1$ letters to preserve $p(y)$ plus five more to preserve the expressions $I(Y;U|Z)$, $I(Z;V|U,X)$, $I(X;W|U,V,Z)$ and the distortions $\mathbb{E}\,d_x(X, \hat{X}(U,V,Z))$, $\mathbb{E}\,d_z(Z, \hat{Z}(U,W,X))$. Note that the joint $p(x, y, z)$ is preserved because of the Markov form $U - Y - X - Z$, and the structure of the joint distribution does not change. We fix $U$, which now has a bounded cardinality, and we apply the support lemma for bounding $V$. The external random variable $V$ must have $|\mathcal{U}||\mathcal{Z}| - 1$ letters to preserve $p(u, z)$ plus four more to preserve the expressions $I(Z;V|U,X)$, $I(X;W|U,V,Z)$ and the distortions $\mathbb{E}\,d_x(X, \hat{X}(U,V,Z))$, $\mathbb{E}\,d_z(Z, \hat{Z}(U,W,X))$. Note that because of the Markov structure $V - (U, Z) - (X, Y)$ the joint distribution $p(u, z, x, y)$ does not change. Finally, we fix $U, V$, which now have a bounded cardinality, and we apply the support lemma for bounding $W$. The external random variable $W$ must have $|\mathcal{U}||\mathcal{V}||\mathcal{X}| - 1$ letters to preserve $p(u, v, x)$ plus two more to preserve the expression $I(X;W|U,V,Z)$ and the distortion $\mathbb{E}\,d_z(Z, \hat{Z}(U,W,X))$. Note that because of the Markov structure $W - (U, V, X) - (Z, Y)$ the joint distribution $p(u, v, x, y, z)$ does not change. ■

Appendix C 
Proof of Lemma 13

Since $W, X, Z$ are jointly Gaussian, we have $\mathbb{E}[X|W, Z] = \alpha W + \beta Z$ for some scalars $\alpha, \beta$. Furthermore, we have

$$X = \alpha W + \beta Z + N, \qquad (97)$$

where $N$ is a Gaussian random variable independent of $(W, Z)$ with zero mean and variance $\sigma^2_{X|W,Z}$. Since $W$ is known to the encoder and decoder, we can subtract $\alpha W$ from $X$, and then, using Wyner-Ziv coding for the Gaussian case [18], we obtain

$$R(D) = \frac{1}{2}\log\frac{\sigma^2_{X|W,Z}}{D}. \qquad (98)$$

Obviously, one cannot achieve a rate smaller than this even if $Z$ is known both to the encoder and decoder, and therefore this is the achievable region.
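The quantity $\sigma^2_{X|W,Z}$ in the lemma is the usual conditional variance of a jointly Gaussian vector. The sketch below computes it from second moments, using the closed-form inverse of the $2 \times 2$ covariance matrix of $(W, Z)$, and then evaluates the rate. The function names are hypothetical, zero means and base-2 logarithms are assumed, and the clamp at zero for $D \ge \sigma^2_{X|W,Z}$ is an added convention.

```python
import math

def cond_variance(var_x, cov_xw, cov_xz, var_w, var_z, cov_wz):
    """Minimum mean square error of estimating X from (W, Z), jointly Gaussian, zero mean."""
    det = var_w * var_z - cov_wz ** 2  # determinant of the covariance matrix of (W, Z)
    # Variance explained by the best linear estimate: c^T Sigma^{-1} c, c = (cov_xw, cov_xz).
    explained = (cov_xw ** 2 * var_z - 2.0 * cov_xw * cov_xz * cov_wz
                 + cov_xz ** 2 * var_w) / det
    return var_x - explained

def rate_distortion(dist, cond_var):
    """Rate (1/2) log2(cond_var / dist), clamped at 0."""
    return max(0.0, 0.5 * math.log2(cond_var / dist))
```

For instance, with $X = Z + A$ and $W = Z + A + B + D$, where $\sigma_A^2 = 1$, $\sigma_B^2 = 2$, $\sigma_D^2 = 0.5$ and $\sigma_Z^2 = 3$, the conditional variance equals $\sigma_A^2(\sigma_B^2 + \sigma_D^2)/(\sigma_A^2 + \sigma_B^2 + \sigma_D^2) = 2.5/3.5$, which is the value used in the achievability proof of Corollary 12.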